Abstract

We present a general class of compressed sensing matrices which are then demonstrated to have associated 
sublinear-time sparse approximation algorithms. We then develop methods for constructing specialized matrices from 
this class which are sparse when multiplied with a discrete Fourier transform matrix. Ultimately, these considerations 
improve previous sampling requirements for deterministic sparse Fourier transform methods. 

1 Introduction 

This paper considers methods for designing matrices which yield near-optimal nonlinear approximations to the Fourier
transform of a given function, $f : [0, 2\pi] \to \mathbb{C}$. Suppose that $f$ is a bandlimited function so that
$\hat{f} \in \mathbb{C}^N$, where $N$ is large. An optimal $k$-term trigonometric approximation to $f$ is given by

$$f_k^{\mathrm{opt}}(x) = \sum_{j=1}^{k} \hat{f}(\omega_j) e^{\mathrm{i} \omega_j x},$$

where $\omega_1, \dots, \omega_N \in (-N/2, N/2] \cap \mathbb{Z}$ are ordered by the magnitudes of their Fourier coefficients so that

$$|\hat{f}(\omega_1)| \ge |\hat{f}(\omega_2)| \ge \dots \ge |\hat{f}(\omega_N)|.$$

The optimal $k$-term approximation error is then

$$\| f - f_k^{\mathrm{opt}} \|_2 = \| \hat{f} - \hat{f}_k^{\mathrm{opt}} \|_2. \qquad (1)$$
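The error identity in Equation (1) can be checked numerically on a sampled signal. The following is a minimal sketch and not part of the original development: it assumes a unitary DFT normalization, indexes frequencies by 0, ..., N-1 rather than (-N/2, N/2], and uses a hypothetical three-tone test signal. All function names are illustrative.

```python
import cmath
import math

def unitary_dft(values, sign=-1):
    """Naive O(N^2) unitary DFT (sign=-1) or inverse DFT (sign=+1)."""
    size = len(values)
    scale = 1.0 / math.sqrt(size)
    return [
        scale * sum(values[n] * cmath.exp(sign * 2j * math.pi * w * n / size)
                    for n in range(size))
        for w in range(size)
    ]

def best_k_term(coeffs, k):
    """Keep the k largest-magnitude coefficients and zero out the rest."""
    order = sorted(range(len(coeffs)), key=lambda w: abs(coeffs[w]), reverse=True)
    kept = set(order[:k])
    return [c if w in kept else 0j for w, c in enumerate(coeffs)]

def l2_norm(vec):
    return math.sqrt(sum(abs(v) ** 2 for v in vec))

# Hypothetical test signal: two strong tones plus one weak tone.
N = 32
signal = [
    3.0 * cmath.exp(2j * math.pi * 5 * n / N)
    + 2.0 * cmath.exp(2j * math.pi * 11 * n / N)
    + 0.1 * cmath.exp(2j * math.pi * 20 * n / N)
    for n in range(N)
]

f_hat = unitary_dft(signal)
f_hat_2 = best_k_term(f_hat, 2)

# Right-hand side of Equation (1): error measured on the Fourier coefficients.
fourier_error = l2_norm([a - b for a, b in zip(f_hat, f_hat_2)])

# Left-hand side of Equation (1): error measured on the samples themselves.
approx_samples = unitary_dft(f_hat_2, sign=+1)
sample_error = l2_norm([a - b for a, b in zip(signal, approx_samples)])
```

Under these assumptions both errors equal the norm of the single discarded weak tone, which is what Parseval's identity predicts.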

It has been demonstrated recently that any periodic function, $f : [0, 2\pi] \to \mathbb{C}$, can be accurately approximated via
sparse Fourier transform (SFT) methods which run in $O(k^2 \log^4 N)$ time (see [25, 26] for details). When the function is
sufficiently Fourier compressible (i.e., when $k \ll N$ yields a small approximation error in Equation (1) above), these
methods can accurately approximate $f$ much more quickly than traditional Fast Fourier Transform (FFT) methods
which run in $O(N \log N)$ time. Furthermore, these SFT methods require only $O(k^2 \log^4 N)$ function evaluations as
opposed to the $N$ function evaluations required by a standard FFT method.

Although the theoretical guarantees of SFT algorithms appear promising, current algorithmic formulations
suffer from several practical shortfalls. Principally, the algorithms currently utilize number theoretic sampling sets
which are constructed in a suboptimal fashion. In this paper we address this deficiency by developing computational
methods for constructing number theoretic matrices of the type required by these SFT methods which are nearly
optimal in size. In the process, we demonstrate that this specific problem is a more constrained instance of a much
more general matrix design problem with connections to compressed sensing matrix constructions [14, 8, 9, 7, 2],
discrete uncertainty principles [17], nonadaptive group testing procedures [18, 20], and codebook design problems
[38, 15, 6] in signal processing.



1.1 General Problem Formulation: Compressed Sensing in the Fourier Setting 

Over the past several years, a stream of work in compressed sensing has provided a general theoretical framework
for approximating general functions in terms of their optimal $k$-term approximation errors (see [19] and references
therein). Indeed, the SFT design problem we are considering herein also naturally falls into this setting. Consider
the following discretized version of the sparse Fourier approximation problem above: Let $\vec{f} \in \mathbb{C}^N$ be a vector of $N$
equally spaced evaluations of $f$ on $[0, 2\pi]$, and define $\mathcal{F}$ to be the $N \times N$ Discrete Fourier Transform (DFT) matrix
defined by $\mathcal{F}_{i,j} = e^{-2\pi \sqrt{-1}\, i j / N} / \sqrt{N}$. Note that $\hat{f} = \mathcal{F}\vec{f}$ will be compressible (i.e., sparse). Compressed sensing methods allow us
to construct an $m \times N$ matrix, $\mathcal{M}$, with $m$ minimized as much as possible subject to the constraint that an associated
approximation algorithm, $\Delta_{\mathcal{M}} : \mathbb{C}^m \to \mathbb{C}^N$, can still accurately approximate any given $\hat{f} = \mathcal{F}\vec{f}$ (and, therefore, $f$
itself). More exactly, compressed sensing methods allow us to minimize $m$, the number of rows in $\mathcal{M}$, as a function
of $k$ and $N$ such that

$$\left\| \hat{f} - \Delta_{\mathcal{M}}\left( \mathcal{M}\hat{f} \right) \right\|_p \le C_{p,q} \cdot k^{\frac{1}{p} - \frac{1}{q}} \left\| \hat{f} - \hat{f}_k^{\mathrm{opt}} \right\|_q \qquad (2)$$

holds for all $\hat{f} \in \mathbb{C}^N$ in various fixed $\ell^p$, $\ell^q$ norms, $1 \le q \le p \le 2$, for an absolute constant $C_{p,q} \in \mathbb{R}$ (e.g., see [12, 19]).
Note that this implies that $\hat{f}$ will be recovered exactly if it contains only $k$ nonzero Fourier coefficients. Similarly, it
will be accurately approximated by $\Delta_{\mathcal{M}}(\mathcal{M}\hat{f})$ any time it is well represented by its largest $k$ Fourier modes.

In this paper we will focus on constructing $m \times N$ compressed sensing matrices, $\mathcal{M}$, for the Fourier recovery
problem which meet the following four design requirements:

1. Small Sampling Requirements: $\mathcal{M}\mathcal{F}$ should be highly column-sparse (i.e., the number of columns of $\mathcal{M}\mathcal{F}$
which contain nonzero entries should be significantly smaller than $N$). Note that whenever $\mathcal{M}\mathcal{F}$ has this property
we can compute $\mathcal{M}\hat{f} = \mathcal{M}\mathcal{F}\vec{f}$ by reading only a small fraction of the entries in $\vec{f}$. Once the number of required
function samples/evaluations is on the order of $N$, a simple fast Fourier transform based approach will be difficult
to beat computationally.

2. Accurate Approximation Algorithms: The matrix $\mathcal{M}$ needs to have an associated approximation algorithm,
$\Delta_{\mathcal{M}}$, which allows accurate recovery. More specifically, we will require an instance optimal error guarantee
along the lines of Equation (2).

3. Efficient Approximation Algorithms: The matrix $\mathcal{M}$ needs to have an associated approximation algorithm,
$\Delta_{\mathcal{M}}$, which is computationally efficient. In particular, the algorithm should be at least polynomial time in $N$
(preferably, $o(N \log N)$-time since $N$ is presumed to be large and we have the goal in mind of competing with
an FFT).

4. Guaranteed Uniformity: Given only $k, N \in \mathbb{Z}^+$ and $p, q \in [1, 2]$, one fixed matrix $\mathcal{M}$ together with a fixed
approximation algorithm $\Delta_{\mathcal{M}}$ should be guaranteed to satisfy the three preceding properties uniformly for all
vectors $\hat{f} \in \mathbb{C}^N$.

The remainder of this paper is organized as follows: We begin with a brief survey of recent sparse Fourier approximation
techniques related to compressed sensing in Section 2. In Section 3 we introduce matrices of a special class
which are useful for fast sparse Fourier approximation and investigate their properties. Most importantly, we demonstrate
that any matrix from this class can be used in combination with an associated fast approximation algorithm in
order to produce a sublinear-time (in $N$) compressed sensing method. Next, in Section 4, we present a deterministic
construction of these matrices that specifically supports sublinear-time Fourier approximation. In Section 5 this matrix
construction method is cast as an optimal design problem whose objective is to minimize Fourier sampling requirements.
Furthermore, lemmas are proven which allow the optimal design problem to be subsequently formulated as
a linear integer program in Section 6. Finally, in Section 7, we empirically investigate the sizes of the optimized
deterministic matrices presented herein.

2 Compressed Sensing and The Restricted Isometry Property 

Over the past few years, compressed sensing has focused primarily on utilizing matrices, $\mathcal{M}$, which satisfy the
Restricted Isometry Property (RIP) [?] in combination with $\ell^1$-minimization based approximation methods [19, 7]. In
fact, RIP matrices appear to be the critical partner in the RIP matrix/$\ell^1$-minimization pair since RIP matrices can also
be used for compressed sensing with numerous other approximation algorithms besides $\ell^1$-minimization (e.g.,
Regularized Orthogonal Matching Pursuit [32, 33], CoSaMP [31], Iterative Hard Thresholding [4], etc.). Hence, we will
consider RIP matrices in isolation.

Definition 1. Let $p \in [1, \infty)$, $N, k \in \mathbb{N}$, and $\epsilon \in (0, 1)$. A matrix $\mathcal{M}$ with complex entries has the Restricted Isometry
Property, RIP$_p(N, k, \epsilon)$, if

$$(1 - \epsilon) \|x\|_p^p \le \|\mathcal{M}x\|_p^p \le (1 + \epsilon) \|x\|_p^p$$

for all $x \in \mathbb{C}^N$ containing at most $k$ nonzero coordinates.

It has been demonstrated that Fourier RIP$_2(N, k, \epsilon)$ matrices of size $O\left(k \, \mathrm{polylog}(N)/\epsilon^2\right) \times N$ exist [36]. More specifically, an
$m \times N$ submatrix of the $N \times N$ Inverse DFT (IDFT) matrix, $\mathcal{F}^{-1}$, formed by randomly selecting $m$ rows of $\mathcal{F}^{-1}$ will
satisfy the RIP$_2(N, k, \epsilon)$ with high probability whenever $m$ is sufficiently large, on the order of $k \, \mathrm{polylog}(N)/\epsilon^2$ [35].
Such a matrix will clearly satisfy
our small sampling requirement since any $m \times N$ submatrix of the $N \times N$ IDFT matrix will generate a matrix containing
exactly $m$ ones (one per row) after being multiplied against the $N \times N$ DFT matrix. Furthermore, $\ell^1$-minimization will yield accurate
approximation of Fourier compressible signals when utilized in conjunction with an IDFT submatrix that has the RIP$_2$.
However, these random Fourier RIP$_2$ constructions have two deficiencies: First, all existing approximation algorithms,
$\Delta_{\mathcal{M}}$, associated with Fourier RIP$_2(N, k, \epsilon)$ matrices, $\mathcal{M}$, run in $O(N \log N)$ time. Thus, they cannot generally compete
with an FFT computationally. Second, randomly generated Fourier submatrices are only guaranteed to have the RIP$_2$
with high probability, and there is no tractable means of verifying that a given matrix has the RIP$_2$. In order to verify
Definition 1 for a given $m \times N$ matrix one generally has to compute the condition numbers of all $\binom{N}{k}$ of its $m \times k$
submatrices.
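To give a sense of why this verification is intractable, the short sketch below counts the $m \times k$ column submatrices that a brute-force check of Definition 1 would have to examine. The sizes used are hypothetical and chosen only for illustration.

```python
import math

def submatrix_count(n_cols, k):
    """Number of ways to choose k of the n_cols columns, i.e. binomial(n_cols, k)."""
    return math.comb(n_cols, k)

# Hypothetical problem sizes: even a modest N and k give an enormous count.
small_case = submatrix_count(4, 2)
large_case = submatrix_count(1024, 10)
```

Even for the modest hypothetical choice of 1024 columns and sparsity 10, the count exceeds $10^{23}$, so checking every submatrix is not realistic.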

Several deterministic RIP$_2(N, k, \epsilon)$ matrix constructions exist which simultaneously address the guaranteed uniformity
requirement while also guaranteeing small Fourier sampling needs [27, 5]. However, they all utilize the notion
of coherence [14] which is discussed in Section 2.2. Hence, we will postpone a more detailed discussion of these
methods until later. For now, we simply note that no existing deterministic RIP$_2(N, k, \epsilon)$ matrix constructions currently
achieve a number of rows (or sampling requirements), $m$, that are $o(k^2 \, \mathrm{polylog}(N))$ for all $k = o(\sqrt{N})$ as $N$ grows
large. In contrast, RIP matrix constructions related to highly unbalanced expander graphs can currently break this
"quadratic-in-$k$ bottleneck".

2.1 Unbalanced Expander Graphs 

Recently it has been demonstrated that the rescaled adjacency matrix of any unbalanced expander graph will be a RIP$_1$
matrix [?].

Definition 2. Let $N, k, d \in \mathbb{N}$, and $\epsilon \in (0, 1)$. A simple bipartite graph $G = (A, B, E)$ with $|A| \ge |B|$ and left degree at
least $d$ is a $(k, d, \epsilon)$-unbalanced expander if for any $X \subset A$ with $|X| \le k$, the set of neighbors, $N(X)$, of $X$ has size
$|N(X)| \ge (1 - \epsilon) d |X|$.

Theorem 1. (See [2, ?].) Consider an $m \times N$ matrix $\mathcal{M}$ that is the adjacency matrix of a regular $(k, d, \epsilon)$-unbalanced
expander, where $1/\epsilon$ and $d$ are both smaller than $N$. Then, there exists an absolute constant $C > 1$ such that the
rescaled matrix, $\mathcal{M}/d^{1/p}$, satisfies the RIP$_p(N, k, C\epsilon)$ for all $1 \le p \le 1 + 1/\log N$.
Note that the RIP$_1$ property for unbalanced expanders is with respect to the $\ell^1$ norm, not the $\ell^2$ norm. Nevertheless,
matrices with the RIP$_1$ property also have associated approximation algorithms that can produce accurate sparse
approximations along the lines of Equation (2). Examples include $\ell^1$-minimization [2, 3] and Matching Pursuit [24].
Perhaps most impressive among the approximation algorithms associated with unbalanced expander graphs are those
which appear to run in $o(N \log N)$-time (see the appendix of [?]). Considering these results with respect to the four
design requirements from Section 1.1, we can see that expander based RIP methods are poised to satisfy both the
second and third requirements. Furthermore, by combining Theorem 1 with recent explicit constructions of unbalanced
expander graphs [22], we can obtain an explicit RIP$_1$ matrix construction of near-optimal dimensions (which,
among other things, shows that RIP$_1$ matrices may also satisfy our fourth Section 1.1 design requirement regarding
guaranteed uniformity). We have the following theorem:

Theorem 2. Let $\epsilon \in (0, 1)$, $p \in [1, 1 + 1/\log N]$, and $N, k \in \mathbb{N}$ be such that $N$ is greater than both $1/\epsilon$ and $k$. Next, choose
any constant parameter $\alpha \in \mathbb{R}^+$. Then, there exists a constant $c \in \mathbb{R}^+$ such that a

$$O\left( k^{1+\alpha} (\log N \log k / \epsilon)^{2 + 2/\alpha} \right) \times N$$

matrix guaranteed to have the RIP$_p(N, k, \epsilon)$ can be constructed in $O(\dots)$-time.
Proof: Consider Theorem 1.3 in [22] in combination with Theorem 1 above. □

Theorem 2 demonstrates the existence of deterministically constructible RIP$_1$ matrices with a number of rows, $m$,
that scales like $O(k^{1+\alpha} \, \mathrm{polylog}(N))$ for all $k < N$ and fixed $\epsilon, \alpha \in (0, 1)$. Furthermore, the run time complexity of the
RIP$_1$ construction algorithm is modest (i.e., $O(N^2)$-time). Although a highly attractive result, there is no guarantee
that Guruswami et al.'s unbalanced expander graphs will generally have adjacency matrices, $\mathcal{M}$, which are highly
column-sparse after multiplication against a DFT matrix (see design requirement number 1 in Section 1.1).$^1$ Hence,
it is unclear whether expander graph based RIP$_1$ results can be utilized to make progress on our compressed sensing
matrix design problem in the Fourier setting. Nevertheless, this challenging avenue of research appears potentially
promising, if not intractably difficult.

2.2 Incoherent Matrices 

As previously mentioned, all deterministic RIP$_2(N, k, \epsilon)$ matrix constructions (e.g., see [16, 27, 35, 5] and references
therein) currently utilize the notion of coherence [14].

Definition 3. Let $\mu \in [0, 1]$. An $m \times N$ matrix, $\mathcal{M}$, with complex entries is called $\mu$-coherent if both of the following
properties hold:

1. Every column of $\mathcal{M}$, denoted $\mathcal{M}_{\cdot,j} \in \mathbb{C}^m$ for $0 \le j \le N - 1$, is normalized so that $\|\mathcal{M}_{\cdot,j}\|_2 = 1$.

2. For all $j, l \in [0, N)$ with $j \ne l$, the associated columns $\mathcal{M}_{\cdot,j}, \mathcal{M}_{\cdot,l} \in \mathbb{C}^m$ have $|\mathcal{M}_{\cdot,j} \cdot \mathcal{M}_{\cdot,l}| \le \mu$.

Theorem 3. (See [35].) Suppose that an $m \times N$ matrix, $\mathcal{M} \in \mathbb{C}^{m \times N}$, is $\mu$-coherent. Then, $\mathcal{M}$ will also have the
RIP$_2(N, k, (k-1)\mu)$.
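As an illustration of Definition 3 and Theorem 3, the sketch below computes the coherence of a small matrix given as a list of columns and then evaluates the RIP$_2$ constant $(k-1)\mu$ that Theorem 3 provides. The function names and the example columns are hypothetical and only meant to make the definitions concrete.

```python
import math

def coherence(columns):
    """Largest |<a, b>| over distinct pairs of columns after unit normalization."""
    def normalize(col):
        norm = math.sqrt(sum(abs(x) ** 2 for x in col))
        return [x / norm for x in col]

    unit = [normalize(col) for col in columns]
    mu = 0.0
    for j in range(len(unit)):
        for l in range(j + 1, len(unit)):
            inner = sum(a * b.conjugate() for a, b in zip(unit[j], unit[l]))
            mu = max(mu, abs(inner))
    return mu

def rip_constant_from_coherence(mu, k):
    """RIP_2 constant guaranteed by Theorem 3 for sparsity level k."""
    return (k - 1) * mu

# Hypothetical example: three columns in C^2.
example_columns = [[1, 0], [0, 1], [1, 1]]
mu = coherence(example_columns)
eps_for_k2 = rip_constant_from_coherence(mu, 2)
```

The guarantee is only informative when $(k-1)\mu < 1$, which is why small coherence is needed for larger sparsity levels.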

Matrices with small coherence are of interest in numerous coding theoretic settings. Note that the column vectors
of a real valued matrix with small coherence, $\mu$, collectively form a spherical code. More generally, the columns of
an incoherent complex valued matrix can be used to form codebooks for various channel coding applications in signal
processing [30, 37]. These applications have helped to motivate a considerable amount of work with incoherent codes
(i.e., incoherent matrices) over the past several decades. As a result, a plethora of $\mu$-coherent matrix constructions
exist (e.g., see [38, 15, 6, 5], and references therein).

1 In fact, $\mathcal{M}$ multiplied against a DFT matrix need not be exactly sparse. By appealing to ideas from [23], one can see that it is enough to have a
relatively small perturbation of $\mathcal{M}$ be column-sparse after multiplication against a DFT matrix.
As we begin to demonstrate in the next section, matrices with low coherence can satisfy all four Fourier design
requirements listed in Section 1.1. However, there are trade-offs. Most notably, the Welch bound [38] implies that any
$\mu$-coherent $m \times N$ matrix, $\mathcal{M} \in \mathbb{C}^{m \times N}$, must have a number of rows

$$m \ge \frac{N}{(N - 1)\mu^2 + 1}.$$

As a consequence, arguments along the lines of Theorem 3 can only use $\mu$-coherent matrices to produce RIP$_2(N, k, \epsilon)$
matrices having $m = \Omega(k^2/\epsilon^2)$ rows. In contrast, $O\left(k \, \mathrm{polylog}(N)/\epsilon^2\right) \times N$ Fourier RIP$_2(N, k, \epsilon)$ matrices are known to exist
(see above). Hence, although $\mu$-coherent matrices do allow one to obtain small Fourier sampling requirements, these
sampling requirements all currently scale quadratically with $k$ instead of linearly.$^2$
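The quadratic scaling can be seen by plugging $\mu = \epsilon/(k-1)$, the coherence needed for Theorem 3 to give a RIP$_2$ constant of $\epsilon$, into the Welch bound. The sketch below does this arithmetic; the function names and the sample values of $N$, $k$ and $\epsilon$ are assumptions made for illustration.

```python
def welch_min_rows(n_cols, mu):
    """Welch lower bound on the rows of a mu-coherent matrix with n_cols columns."""
    return n_cols / ((n_cols - 1) * mu ** 2 + 1)

def rows_needed_via_coherence(n_cols, k, eps):
    """Rows forced by the Welch bound when Theorem 3 is applied with (k - 1) * mu = eps."""
    mu = eps / (k - 1)
    return welch_min_rows(n_cols, mu)

# Hypothetical sizes: N = 10^6 columns, sparsity k = 10, RIP constant eps = 0.5.
lower_bound = rows_needed_via_coherence(10 ** 6, 10, 0.5)
quadratic_estimate = ((10 - 1) / 0.5) ** 2  # (k - 1)^2 / eps^2
```

For large $N$ the bound approaches $(k-1)^2/\epsilon^2$, which matches the $\Omega(k^2/\epsilon^2)$ behavior described above.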

Setting aside the quadratic scaling of $m$ with $k$, we can see that several existing deterministic RIP$_2(N, k, \epsilon)$ matrix
constructions based on coherence arguments (e.g., [27, 5]) immediately satisfy all but one of the Fourier design requirements
listed in Section 1.1. First, these constructions lead to Fourier sampling requirements which, although generally
quadratic in the sparsity parameter $k$, are nonetheless $o(N)$. Second, these matrices can be used in conjunction with
accurate approximation algorithms (e.g., $\ell^1$-minimization) since they will have the RIP$_2$. Third, the deterministic nature
of these RIP$_2$ matrices guarantees uniform approximation results for all possible periodic functions. The only
unsatisfied design requirement pertains to the computational efficiency of the approximation algorithms (see requirement 3
in Section 1.1). As mentioned previously, all existing approximation algorithms associated with Fourier RIP$_2(N, k, \epsilon)$
matrices run in $O(N \log N)$ time. In the next section we will present a general class of incoherent matrices which have
fast approximation algorithms associated with them. As a result, we will develop a general framework for constructing
fast sparse Fourier algorithms which are capable of approximating compressible signals more quickly than standard
FFT algorithms.



3 A Special Class of Incoherent Matrices 

In this section, we will consider binary incoherent matrices, $\mathcal{M} \in \{0, 1\}^{m \times N}$, as a special subclass of incoherent
matrices. As we shall see, binary incoherent matrices can be used to construct RIP$_2$ matrices (e.g., via Theorem 3),
unbalanced expander graphs (and, therefore, RIP$_{p \approx 1}$ matrices via Theorem 1), and nonadaptive group testing matrices
[18]. In addition, we prove that any binary incoherent matrix can be modified to have an associated accurate
approximation algorithm, $\Delta_{\mathcal{M}} : \mathbb{C}^m \to \mathbb{C}^N$, with sublinear $o(N)$ run time complexity. This result generalizes the fast sparse
Fourier transforms previously developed in [26] to the standard compressed sensing setup while simultaneously
providing a framework for the subsequent development of similar Fourier results. We will begin this process by formally
defining $(K, \alpha)$-coherent matrices and then noting some accompanying bounds.

Definition 4. Let $K, \alpha \in [1, m] \cap \mathbb{N}$. An $m \times N$ binary matrix, $\mathcal{M} \in \{0, 1\}^{m \times N}$, is called $(K, \alpha)$-coherent if both of the
following properties hold:

1. Every column of $\mathcal{M}$ contains at least $K$ nonzero entries.

2. For all $j, l \in [0, N)$ with $j \ne l$, the associated columns, $\mathcal{M}_{\cdot,j}$ and $\mathcal{M}_{\cdot,l} \in \{0, 1\}^m$, have $\mathcal{M}_{\cdot,j} \cdot \mathcal{M}_{\cdot,l} \le \alpha$.

Several deterministic constructions for $(K, \alpha)$-coherent matrices have been implicitly developed as part of RIP$_2$
matrix constructions (e.g., see [16, 27]). It is not difficult to see that any $(K, \alpha)$-coherent matrix will be $\frac{\alpha}{K}$-coherent
after having its columns normalized. Hence, the Welch bound also applies to $(K, \alpha)$-coherent matrices. Below we will
both develop tighter lower row bounds, and provide a preliminary demonstration of the existence of fast $o(N)$-time
compressed sensing algorithms related to incoherent matrices. This will be done by demonstrating the relationship
between $(K, \alpha)$-coherent matrices and group testing matrices.
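Definition 4 and the $\alpha/K$ coherence claim can be checked directly on a small binary matrix. The following sketch is illustrative only: the helper names and the 4 x 4 example matrix are hypothetical, and the matrix is stored as a list of rows.

```python
import math

def column_supports(matrix):
    """Set of row indices holding a 1, for each column of a binary matrix."""
    n_cols = len(matrix[0])
    return [{i for i, row in enumerate(matrix) if row[j]} for j in range(n_cols)]

def is_k_alpha_coherent(matrix, K, alpha):
    """Check both conditions of Definition 4."""
    supports = column_supports(matrix)
    if any(len(s) < K for s in supports):
        return False
    return all(
        len(supports[j] & supports[l]) <= alpha
        for j in range(len(supports))
        for l in range(j + 1, len(supports))
    )

def normalized_coherence(matrix):
    """Coherence (Definition 3) after scaling every column to unit l2 norm."""
    supports = column_supports(matrix)
    return max(
        len(supports[j] & supports[l]) / math.sqrt(len(supports[j]) * len(supports[l]))
        for j in range(len(supports))
        for l in range(j + 1, len(supports))
    )

# Hypothetical example: every column has two ones, distinct columns share at most one.
example = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
```

Here the example is $(2, 1)$-coherent, and its normalized coherence equals $\alpha/K = 1/2$.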

2 It is worth noting that Bourgain et al. recently used methods from additive combinatorics in combination with modified coherence arguments to
construct explicit $m \times N$ matrices, with $m = O(k^{2 - \epsilon'})$, which have the Fourier RIP$_2(N, k, m^{-\epsilon'})$ whenever $k = \Omega(N^{1/2 - \epsilon'})$ [5]. Here $\epsilon' > 0$ is some
constant real number. Hence, it is possible to break the previously mentioned "quadratic bottleneck" for RIP$_2(N, k, \epsilon)$ matrices when $k$ is sufficiently
large.
3.1 Group Testing: Lower Bounds and Fast Recovery



Group testing generally involves the creation of testing procedures which are designed to identify a small number of
interesting items hidden within a much larger set of uninteresting items [18, 20]. Suppose we are given a collection of
$N$ items, each of which is either interesting or uninteresting. The status of each item in the set can then be represented
by a boolean vector $x \in \{0, 1\}^N$. Interesting items are denoted with a 1 in the vector, while uninteresting items are
marked with a 0. Because most items are uninteresting, $x$ will contain at most a small number, $d < N$, of ones. Our
goal is to correctly identify the nonzero entries of $x$, thereby recovering $x$ itself.

Consider the following example. Suppose that x corresponds to a list of professional athletes, at most d of which 
are secretly using a new performance enhancing drug. Furthermore, imagine that the only test for the drug is an 
expensive and time consuming blood test. The trivial solution would be to collect blood samples from all N athletes 
and then test each blood sample individually for the presence of the drug. However, this is unnecessarily expensive 
when the test is accurate and the number of drug users is small. A cheaper solution involves pooling portions of each 
player's blood into a small number of well-chosen testing pools. Each of these testing pools can then be tested once, 
and the results used to identify the offenders. 

A pooling-based testing procedure as described above can be modeled mathematically as a boolean matrix
$\mathcal{M} \in \{0, 1\}^{m \times N}$. Each row of $\mathcal{M}$ corresponds to a subset of the $N$ athletes whose blood will be pooled, mixed, and then
tested once for the presence of the drug. Hence, the goal of our nonadaptive group testing can be formulated as follows:
Design a matrix, $\mathcal{M} \in \{0, 1\}^{m \times N}$, with as few rows as possible so that any boolean vector, $x \in \{0, 1\}^N$, containing at
most $d$ nonzero entries can be recovered exactly from the result of the pooled tests, $\mathcal{M}x \in \{0, 1\}^m$. Here all arithmetic is
boolean, with the boolean OR operator replacing summation and the boolean AND operator replacing multiplication.
One well studied solution to this nonadaptive group testing problem is to let $\mathcal{M}$ be a $d$-disjunct matrix.
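The boolean measurement model can be sketched in a few lines. The decoder below is the standard naive decoder for disjunct matrices (flag an item exactly when every pool containing it tests positive); it is shown here as an assumption for illustration and is not taken from the text above. The pooling matrix and the hidden vector are hypothetical.

```python
def pooled_tests(matrix, x):
    """Boolean product Mx: test i is positive iff pool i contains an interesting item."""
    return [int(any(m and xi for m, xi in zip(row, x))) for row in matrix]

def naive_decode(matrix, results):
    """Flag item j iff every pool containing j tested positive.

    An item that appears in no pool would be flagged vacuously, so every
    column of the pooling matrix is assumed to contain at least one 1.
    """
    n_items = len(matrix[0])
    return [
        int(all(results[i] for i, row in enumerate(matrix) if row[j]))
        for j in range(n_items)
    ]

# Hypothetical pooling design for 6 items: pools are residue classes mod 2 and mod 3.
pools = [[1 if n % s == h else 0 for n in range(6)] for s in (2, 3) for h in range(s)]

hidden = [0, 0, 0, 0, 1, 0]          # a single interesting item (d = 1)
results = pooled_tests(pools, hidden)
recovered = naive_decode(pools, results)
```

With this design, five pooled tests identify the single interesting item among six, instead of six individual tests.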

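Definition 5. An $m \times N$ binary matrix, $\mathcal{M} \in \{0, 1\}^{m \times N}$, is called $d$-disjunct if for any subset of $d + 1$ columns of $\mathcal{M}$,
$C = \{c_1, c_2, \dots, c_{d+1}\} \subset [1, N] \cap \mathbb{N}$, there exists a subset of $d + 1$ rows of $\mathcal{M}$, $R = \{j_1, j_2, \dots, j_{d+1}\} \subset [1, m] \cap \mathbb{N}$,
such that the submatrix

$$\begin{pmatrix}
\mathcal{M}_{j_1,c_1} & \mathcal{M}_{j_1,c_2} & \dots & \mathcal{M}_{j_1,c_{d+1}} \\
\mathcal{M}_{j_2,c_1} & \mathcal{M}_{j_2,c_2} & \dots & \mathcal{M}_{j_2,c_{d+1}} \\
\vdots & & \ddots & \vdots \\
\mathcal{M}_{j_{d+1},c_1} & \mathcal{M}_{j_{d+1},c_2} & \dots & \mathcal{M}_{j_{d+1},c_{d+1}}
\end{pmatrix}$$

is the $(d + 1) \times (d + 1)$ identity matrix.$^3$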

Nonadaptive group testing is closely related to the recovery of "exactly sparse" vectors $x \in \mathbb{R}^N$ containing exactly $d$
nonzero entries. In fact, it is not difficult to modify standard group testing techniques to solve such problems. However,
it is not generally possible to modify these approaches in order to obtain methods capable of achieving the type of
approximation guarantees we are interested in here (i.e., see Equation (2)). Nevertheless, fast $o(N)$-time approximation
algorithms based on $d$-disjunct matrices with weaker approximation guarantees have been developed [13]. Hence,
if we can relate $(K, \alpha)$-coherent matrices to $d$-disjunct matrices, we will informally settle the design requirement
regarding the existence of fast approximation algorithms (see the third design requirement in Section 1.1).

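Lemma 1. An $m \times N$ $(K, \alpha)$-coherent matrix, $\mathcal{M}$, will also be $\lfloor (K - 1)/\alpha \rfloor$-disjunct.

Proof: Choose any subset of $\lfloor (K - 1)/\alpha \rfloor + 1$ columns from $\mathcal{M}$, $C = \{c_1, c_2, \dots, c_{\lfloor (K-1)/\alpha \rfloor + 1}\} \subset [1, N] \cap \mathbb{N}$. Consider
the column $\mathcal{M}_{\cdot,c_1} \in \{0, 1\}^m$. Because $\mathcal{M}$ is a binary $(K, \alpha)$-coherent matrix, we know that for each other column $c_l \in C$
there can be at most $\alpha$ rows, $j$, for which $\mathcal{M}_{j,c_1} = \mathcal{M}_{j,c_l} = 1$. Hence, there are at most $\alpha \lfloor (K - 1)/\alpha \rfloor \le K - 1$ total rows in which
$\mathcal{M}_{\cdot,c_1}$ will share a 1 with any of the other columns listed in $C$. Since $\mathcal{M}_{\cdot,c_1}$ contains at least $K$ ones, there exists a row,
$j_1 \in [1, m] \cap \mathbb{N}$, containing a 1 in column $c_1$ and zeroes in all of $C - \{c_1\}$. Repeating this argument with
$c_2, \dots, c_{\lfloor (K-1)/\alpha \rfloor + 1}$ replacing $c_1$ above proves the lemma. □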
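Lemma 1 can be sanity-checked by brute force on a small example. The sketch below uses the traditional statement of $d$-disjunctness (no column is covered by the boolean OR of $d$ other columns) and a hypothetical binary matrix whose rows are residue classes of the first six integers; all names are illustrative.

```python
from itertools import combinations

def column_supports(matrix):
    """Set of row indices holding a 1, for each column of a binary matrix."""
    n_cols = len(matrix[0])
    return [{i for i, row in enumerate(matrix) if row[j]} for j in range(n_cols)]

def is_disjunct(matrix, d):
    """True iff no column is contained in the union of any d other columns."""
    supports = column_supports(matrix)
    n_cols = len(supports)
    for j in range(n_cols):
        others = [c for c in range(n_cols) if c != j]
        for group in combinations(others, d):
            covered = set().union(*(supports[c] for c in group))
            if supports[j] <= covered:
                return False
    return True

# Hypothetical example: 6 columns, rows are residue classes mod 2, 3, 5 and 7.
matrix = [[1 if n % s == h else 0 for n in range(6)]
          for s in (2, 3, 5, 7) for h in range(s)]

supports = column_supports(matrix)
K = min(len(s) for s in supports)
alpha = max(len(a & b) for a, b in combinations(supports, 2))
d = (K - 1) // alpha  # disjunctness level promised by Lemma 1
```

For this example $K = 4$ and $\alpha = 1$, so Lemma 1 predicts a 3-disjunct matrix, which the exhaustive check confirms.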



3 This is not the standard statement of the definition. Traditionally, a boolean matrix $\mathcal{M}$ is said to be $d$-disjunct if the boolean OR of any $d$ of
its columns does not contain any other column [18, 20]. However, these two definitions are essentially equivalent. The $d$-disjunct condition is also
equivalent to the $(d + 1)$-strongly selective condition utilized by compressed sensing algorithms based on group testing matrices [13].
Any $m \times N$ $d$-disjunct matrix must have $m = \Omega\left(\min\{d^2 \log_d N, N\}\right)$ [?]. Furthermore, near-optimal explicit
$d$-disjunct measurement matrix constructions of size $O(d^2 \log N) \times N$ exist [34]. Of more interest here, however, is that
the lower bound for $d$-disjunct matrices together with Lemma 1 provides a lower bound for $(K, \alpha)$-coherent matrices.
More specifically, we can see that any $m \times N$ $(K, \alpha)$-coherent matrix must have $m = \Omega\left(\min\{(K^2/\alpha^2) \log_{K/\alpha} N, N\}\right)$.

In the next section we will demonstrate that ideas from previous fast compressed sensing approximation methods
based on $d$-disjunct matrices [13] can be utilized in combination with the properties of $(K, \alpha)$-coherent matrices to
obtain the type of stronger approximation guarantees we consider in this paper. In the process we will simultaneously
decrease the previously obtained runtime complexities of these algorithms for general signals. As a result, we will
obtain entirely deterministic sublinear-time (in $N$) approximation algorithms which match the runtime and approximation
guarantees previously only achieved with uniformly high probability by sublinear-time methods based on random
measurement matrices (e.g., [?]).



3.2 Properties of Binary Incoherent Matrices 

The following theorem summarizes several important properties of $(K, \alpha)$-coherent matrices with respect to general
sparse approximation problems. Most importantly, the first statement guarantees the existence of a simple sublinear-time
recovery algorithm, $\Delta_{\mathcal{M}}$, which is guaranteed to satisfy an approximation guarantee along the lines of Equation (2)
for all $(K, \alpha)$-coherent matrices, $\mathcal{M}$, and vectors $\vec{x} \in \mathbb{C}^N$.

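Theorem 4. Let $\mathcal{M}$ be an $m \times N$ $(K, \alpha)$-coherent matrix. Then, all of the following statements will hold:

1. Let $\epsilon \in (0, 1]$, $k \in \left[1, K \cdot \frac{\epsilon}{4\alpha}\right) \cap \mathbb{N}$. There exists an approximation algorithm based on a modified form of $\mathcal{M}$,
$\Delta_{\mathcal{M}} : \mathbb{C}^{m \lceil \log_2 N \rceil + m} \to \mathbb{C}^N$, that is guaranteed to output a vector $\vec{z}_S \in \mathbb{C}^N$ satisfying

$$\|\vec{x} - \vec{z}_S\|_2 \le \|\vec{x} - \vec{x}_k^{\mathrm{opt}}\|_2 + \frac{22 \epsilon \left\| \vec{x} - \vec{x}_{(k/\epsilon)}^{\mathrm{opt}} \right\|_1}{\sqrt{k}}$$

for all $\vec{x} \in \mathbb{C}^N$. Most importantly, $\Delta_{\mathcal{M}}$ can be evaluated in $O(m \log N)$-time. See Appendix A for details.

2. Define the $m \times N$ matrix $\mathcal{W}$ by normalizing the columns of $\mathcal{M}$ so that $\mathcal{W}_{i,j} = \mathcal{M}_{i,j} / \sqrt{\|\mathcal{M}_{\cdot,j}\|_1}$. Then, the matrix
$\mathcal{W}$ will be $\frac{\alpha}{K}$-coherent.

3. Furthermore, the $m \times N$ matrix $\mathcal{W}$ defined above will have the RIP$_2(N, k, (k - 1)\alpha/K)$.$^4$

4. Define the $m \times N$ matrix $\mathcal{W}$ by $\mathcal{W}_{i,j} = \mathcal{M}_{i,j} / \left(\|\mathcal{M}_{\cdot,j}\|_1\right)^{1/p}$. Then, the matrix $\mathcal{W}$ will have the
RIP$_p(N, k, C(k - 1)\alpha/K)$ for all $1 \le p \le 1 + 1/\log N$, where $C$ is an absolute constant larger than $1/2$.

5. $\mathcal{M}$ is $\lfloor (K - 1)/\alpha \rfloor$-disjunct.

6. $\mathcal{M}$ has at least $m = \Omega\left(\min\{(K^2/\alpha^2) \log_{K/\alpha} N, N\}\right)$ rows.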



Proof: The proof of each part is as follows. 

1. See Appendix A.

2. The proof follows easily from the definitions.

3. The proof follows from part 2 together with Theorem 3. However, for the sake of completeness we will recount
the proof in more detail here.

4 It is worth noting that modified $(K, \alpha)$-coherent matrices can also be used as Johnson-Lindenstrauss embeddings. See [29] together with [1] to
learn more about the near equivalence of Johnson-Lindenstrauss embeddings and RIP$_2$ matrices.
Let $X = \{x_1, \dots, x_k\} \subset [0, N)$. Given any such $X$ with $|X| = k$, we define $\mathcal{W}_X$ to be the $m \times k$ matrix consisting
of the $k$ columns of $\mathcal{W}$ indexed by $X$. We will consider the $k \times k$ Grammian (and therefore symmetric and
non-negative definite) matrix

$$\mathcal{W}_X^T \mathcal{W}_X = I + D_X.$$

Our strategy will be to bound both $\|D_X\|_1$ and $\|D_X\|_\infty$ in the hope of applying Gerschgorin's theorem.
Each off diagonal entry $(D_X)_{i,j}$, $i \ne j$, is the inner product of the $x_i$ and $x_j$ columns of $\mathcal{W}$. Thus, we have

$$(D_X)_{i,j} = \frac{\mathcal{M}_{\cdot,x_i} \cdot \mathcal{M}_{\cdot,x_j}}{\sqrt{\|\mathcal{M}_{\cdot,x_i}\|_1 \|\mathcal{M}_{\cdot,x_j}\|_1}} \le \frac{\alpha}{K}$$

since $\mathcal{M}$ is $(K, \alpha)$-coherent. The end result is that both $\|D_X\|_1$ and $\|D_X\|_\infty$ are at most $\frac{(k-1)\alpha}{K}$. Applying
Gerschgorin's disk theorem we immediately see that the largest and smallest possible singular values of $\mathcal{W}_X$ are
$\sqrt{1 + \frac{(k-1)\alpha}{K}}$ and $\sqrt{1 - \frac{(k-1)\alpha}{K}}$, respectively. The result follows.

4. Note that we can consider $\mathcal{M}$ to be the adjacency matrix of a bipartite graph, $G = (A, B, E)$, with $|A| = N$ and
$|B| = m$. Each element of $A$ will have degree at least $K$. Furthermore, for any $X \subset A$ with $|X| \le k$ we can see
that the set of neighbors of $X$ will have

$$|N(X)| \ge \sum_{j=0}^{|X|-1} (K - j\alpha) \ge |X| \cdot K \left( 1 - \frac{\alpha(|X| - 1)}{2K} \right).$$

Hence, $\mathcal{M}$ is the adjacency matrix of a $(k, K, (k - 1)\alpha/2K)$-unbalanced expander graph. The result now follows
from the proof of Theorem 1 in [2].



Finally, the proofs of parts 5 and 6 follow from Lemma 1 and the subsequent discussion in Section 3.1, respectively. □
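The Gerschgorin argument used for part 3 can be illustrated numerically. The sketch below builds a small binary matrix (rows are residue classes of 20 consecutive integers, a hypothetical choice), normalizes its columns as in part 2, and computes the largest off-diagonal row sum of the Gram matrix over all $k$-column subsets. This quantity should never exceed $(k-1)\alpha/K$. All names are illustrative.

```python
import math
from itertools import combinations

def normalized_gram(matrix):
    """Gram matrix of the columns after scaling each column to unit l2 norm."""
    n_cols = len(matrix[0])
    cols = [[row[j] for row in matrix] for j in range(n_cols)]
    weights = [sum(col) for col in cols]
    return [
        [sum(x * y for x, y in zip(cols[a], cols[b])) / math.sqrt(weights[a] * weights[b])
         for b in range(n_cols)]
        for a in range(n_cols)
    ]

def max_gershgorin_radius(gram, k):
    """Largest off-diagonal row sum of the k x k Gram submatrix over all k-subsets."""
    worst = 0.0
    for subset in combinations(range(len(gram)), k):
        for a in subset:
            worst = max(worst, sum(gram[a][b] for b in subset if b != a))
    return worst

# Hypothetical example: 20 columns, rows are residue classes mod 3, 4, 5 and 7.
moduli = (3, 4, 5, 7)
N = 20
M = [[1 if n % s == h else 0 for n in range(N)] for s in moduli for h in range(s)]

G = normalized_gram(M)
K = len(moduli)  # each column has exactly one 1 per modulus
alpha = round(K * max(G[a][b] for a in range(N) for b in range(N) if a != b))

radius_k2 = max_gershgorin_radius(G, 2)
radius_k3 = max_gershgorin_radius(G, 3)
```

In this example $\alpha = 2$ and $K = 4$, so the radii are bounded by $1/2$ for $k = 2$ and by $1$ for $k = 3$. The second bound is attained, which shows that the resulting RIP$_2$ constant is only useful while $k$ stays well below $K/\alpha$.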



Recall that explicit constructions of $(K, \alpha)$-coherent matrices exist [16, 27]. It is worth noting that RIP$_2$ matrix
constructions based on these $(K, \alpha)$-coherent matrices are optimal in the sense that any RIP$_2$ matrix with binary entries
must have a similar number of rows [10]. More interestingly, Theorem 4 formally demonstrates that $(K, \alpha)$-coherent
matrices satisfy all the Fourier design requirements in Section 1.1 other than the first one regarding small Fourier
sampling requirements. In the sections below we will consider an optimized number theoretic construction for
$(K, \alpha)$-coherent matrices along the lines of the construction implicitly utilized in [27, 26]. As we shall demonstrate, these
constructions have small Fourier sampling requirements. Hence, they will satisfy all four desired Fourier design
requirements.



4 A $(K, \alpha)$-Coherent Matrix Construction

Let $\mathcal{F}_N$ denote the $N \times N$ unitary discrete Fourier transform matrix,

$$(\mathcal{F}_N)_{i,j} = \frac{e^{-2\pi \sqrt{-1}\, i j / N}}{\sqrt{N}}.$$

Recall that we want an $m \times N$ matrix $\mathcal{M}$ with the property that $\mathcal{M}\mathcal{F}_N$ contains nonzero values in as few columns as
possible. In addition, we want $\mathcal{M}$ to be a binary $(K, \alpha)$-coherent matrix so that we can utilize the sublinear-time
approximation technique provided by Theorem 4. It appears to be difficult to achieve both of these goals simultaneously
as stated. Hence, we will instead optimize a construction recently utilized in [26] which solves a trivial variant of this
problem.

Let $N, \tilde{N} \in \mathbb{N}$ with $\tilde{N} \ge N$. We will say that an $m \times \tilde{N}$ matrix, $\mathcal{M}$, is $(K, \alpha)_N$-coherent if the $m \times N$ submatrix of
$\mathcal{M}$ formed by its first $N$ columns is $(K, \alpha)$-coherent. In what follows we will consider ourselves to be working with
$(K, \alpha)_N$-coherent matrices whose first $N$ columns match a given $m \times N$ $(K, \alpha)$-coherent matrix of interest. Note that
this slight generalization will not meaningfully change anything previously discussed. For example, we may apply
Theorem 4 to the submatrix formed by the first $N$ columns of any given $(K, \alpha)_N$-coherent matrix, $\mathcal{M}$, thereby effectively
applying Theorem 4 to $\mathcal{M}$ in the context of approximating vectors belonging to a fixed $N$-dimensional subspace
of $\mathbb{C}^{\tilde{N}}$. The last $\tilde{N} - N$ columns of any $(K, \alpha)_N$-coherent matrix $\mathcal{M}$ will be entirely ignored throughout this paper with
one exception: We will hereafter consider it sufficient to guarantee that $\mathcal{M}\mathcal{F}_{\tilde{N}}$ (as opposed to $\mathcal{M}\mathcal{F}_N$) contains nonzero
values in as few columns as possible. This modification will not alter the sparse Fourier approximation guarantees (i.e.,
see Equation (2)) obtainable via Theorem 4 in any way when the functions being approximated are $N$-bandlimited.
However, allowing $\tilde{N}$ to be greater than $N$ will help us obtain small Fourier sampling requirements.

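Let $\mathcal{M}$ be an $m \times \tilde{N}$ $(K, \alpha)_N$-coherent matrix. It is useful to note that the column sparsity we desire in $\mathcal{M}\mathcal{F}_{\tilde{N}}$ is
closely related to the discrete uncertainty principles previously considered in [17].

Theorem 5. (See [17].) Suppose $y \in \mathbb{C}^{\tilde{N}}$ contains $N_t$ nonzero entries, while $\hat{y} = y^T \mathcal{F}_{\tilde{N}}$ contains $N_\omega$ nonzero entries.
Then, $N_t N_\omega \ge \tilde{N}$. Furthermore, $N_t N_\omega = \tilde{N}$ holds if and only if $y$ is a scalar multiple of a cyclic permutation of the
picket fence sequence in $\mathbb{C}^{\tilde{N}}$ containing $\upsilon$ equally-spaced nonzero elements (entry $n$ is nonzero exactly when
$n \equiv 0 \bmod \tilde{N}/\upsilon$), where $\upsilon \in \mathbb{N}$ divides $\tilde{N}$.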

We will build $m \times \tilde{N}$ $(K, \alpha)_N$-coherent matrices, $\mathcal{M}$, below whose rows are each a permuted binary picket fence
sequence. In this case Theorem 5 can be used to bound the number of columns of $\mathcal{M}\mathcal{F}_{\tilde{N}}$ which contain nonzero
entries. This, in turn, will bound the number of function samples required in order to approximate a given periodic
bandlimited function.

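We create an $m \times \tilde{N}$ $(K, \alpha)_N$-coherent matrix $\tilde{M}$ as follows: Choose $K$ pairwise relatively prime integers

$$ s_1 < \cdots < s_K $$

and let $\tilde{N} = \prod_{j=1}^{K} s_j \ge N$. Next, we produce a picket fence row, $\tilde{r}_{j,h}$, for each $j \in [1, K] \cap \mathbb{N}$ and $h \in [0, s_j) \cap \mathbb{Z}$. Thus,
the $n$th entry of each row $\tilde{r}_{j,h}$ is given by

$$ (\tilde{r}_{j,h})_n = \delta\big((n - h) \bmod s_j\big) = \begin{cases} 1 & \text{if } n \equiv h \pmod{s_j} \\ 0 & \text{otherwise} \end{cases} \qquad (3) $$

where $n \in [0, \tilde{N}) \cap \mathbb{Z}$. We then form $\tilde{M}$ by setting

$$ \tilde{M} = \begin{pmatrix} \tilde{r}_{1,0} \\ \tilde{r}_{1,1} \\ \vdots \\ \tilde{r}_{1,s_1-1} \\ \tilde{r}_{2,0} \\ \vdots \\ \tilde{r}_{K,s_K-1} \end{pmatrix}. \qquad (4) $$

For an example measurement matrix see Figure 1.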

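Lemma 2. An $m \times \tilde{N}$ matrix $\tilde{M}$ as constructed in Equation (4) will be $(K, \lfloor \log_{s_1} N \rfloor)_N$-coherent with $m = \sum_{j=1}^{K} s_j$.

$$
\begin{array}{l|cccccccc}
n \in [0, \tilde{N}) & 0 & 1 & 2 & 3 & 4 & 5 & 6 & \cdots \\ \hline
n \equiv 0 \pmod{2} & 1 & 0 & 1 & 0 & 1 & 0 & 1 & \cdots \\
n \equiv 1 \pmod{2} & 0 & 1 & 0 & 1 & 0 & 1 & 0 & \cdots \\
n \equiv 0 \pmod{3} & 1 & 0 & 0 & 1 & 0 & 0 & 1 & \cdots \\
n \equiv 1 \pmod{3} & 0 & 1 & 0 & 0 & 1 & 0 & 0 & \cdots \\
n \equiv 2 \pmod{3} & 0 & 0 & 1 & 0 & 0 & 1 & 0 & \cdots \\
\vdots & & & & \vdots & & & & \\
n \equiv 1 \pmod{5} & 0 & 1 & 0 & 0 & 0 & 0 & 1 & \cdots \\
\vdots & & & & \vdots & & & &
\end{array}
$$

Figure 1: Measurement Matrix, $\tilde{M}$, Using $s_1 = 2$, $s_2 = 3$, $s_3 = 5$, ...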



Proof: Choose any two distinct integers, $l \ne n$, from $[0, N)$. Let $\tilde{M}_{\cdot,l}$ and $\tilde{M}_{\cdot,n}$ denote the $l$th and $n$th columns of $\tilde{M}$,
respectively. The inner product of these columns is

$$ \tilde{M}_{\cdot,l} \cdot \tilde{M}_{\cdot,n} = \sum_{j=1}^{K} \delta\big((n - l) \bmod s_j\big). $$

The sum above is at most the maximum $\alpha$ for which $\prod_{j=1}^{\alpha} s_j \le N$ by the Chinese Remainder Theorem. Furthermore,
this value is itself bounded above by $\lfloor \log_{s_1} N \rfloor$. The equation for $m$ immediately follows from the construction of $\tilde{M}$
above. □

The following Lemma is a consequence of Theorem 5.

Lemma 3. Let $\tilde{M}$ be an $m \times \tilde{N}$ matrix as constructed in Equation (4). Then, $\tilde{M}\mathcal{F}_{\tilde{N}}$ will contain nonzero entries in
exactly $m - K + 1 = \left( \sum_{j=1}^{K} s_j \right) - K + 1$ columns.

Proof: Fix $j \in [1, K] \cap \mathbb{N}$. Each picket fence row, $\tilde{r}_{j,h} \in \{0, 1\}^{\tilde{N}}$, contains $\tilde{N}/s_j$ ones. Thus, by Theorem 5, $\tilde{r}_{j,h}\mathcal{F}_{\tilde{N}}$ contains $s_j$ nonzero
entries for all $h \in [0, s_j) \cap \mathbb{Z}$. Furthermore, $\tilde{r}_{j,h}\mathcal{F}_{\tilde{N}}$ contains nonzero values in the same entries for all $h \in [0, s_j) \cap \mathbb{Z}$
since all rows (with $j$ fixed) are cyclic permutations of one another. Finally, let $l, j \in [1, K] \cap \mathbb{N}$ with $j \ne l$ and
suppose that $\tilde{r}_{j,h}\mathcal{F}_{\tilde{N}}$ and $\tilde{r}_{l,g}\mathcal{F}_{\tilde{N}}$ both have nonzero values in the same entry. This can only happen if

$$ h \, \frac{\tilde{N}}{s_j} = g \, \frac{\tilde{N}}{s_l} $$

for a pair of integers $0 \le h < s_j$ and $0 \le g < s_l$. However, since $s_j$ and $s_l$ are relatively prime, Euclid's lemma implies
that this can only happen when $h = g = 0$. The result follows. □
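
The construction and the two lemmas above can be checked numerically on a toy instance. The following is a minimal sketch, not part of the original development: the helper names are hypothetical, the DFT sign convention $e^{-2\pi i n\omega/\tilde{N}}$ is an assumption (it does not affect which columns are nonzero), and the choice $s = (2, 3, 5)$ simply mirrors Figure 1.

```python
import cmath
from itertools import combinations
from math import prod

def picket_fence_matrix(s):
    """Rows r_{j,h}: entry n equals 1 when n ≡ h (mod s_j), for n in [0, prod(s))."""
    n_tilde = prod(s)
    return [[1 if n % sj == h else 0 for n in range(n_tilde)]
            for sj in s for h in range(sj)]

def max_column_inner_product(M, num_cols):
    """Largest inner product between two distinct columns among the first num_cols."""
    return max(sum(row[a] * row[b] for row in M)
               for a, b in combinations(range(num_cols), 2))

def nonzero_dft_columns(M, tol=1e-9):
    """Frequencies w at which at least one row r has a nonzero coefficient (r F)_w."""
    n_tilde = len(M[0])
    support = set()
    for row in M:
        ones = [n for n, v in enumerate(row) if v]
        for w in range(n_tilde):
            coeff = sum(cmath.exp(-2j * cmath.pi * n * w / n_tilde) for n in ones)
            if abs(coeff) > tol:
                support.add(w)
    return sorted(support)

s = [2, 3, 5]                      # pairwise relatively prime, N-tilde = 30
M = picket_fence_matrix(s)
m, K = len(M), len(s)
freqs = nonzero_dft_columns(M)
print(m, len(freqs), m - K + 1)    # 10 8 8
```

Every column of this matrix contains exactly $K = 3$ ones, and restricting attention to the first $N = 6$ columns gives a maximum column inner product of 1, which is within the $\lfloor \log_{s_1} N \rfloor = 2$ bound of Lemma 2.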

We can now see that the matrix construction presented in this section satisfies all four of our Fourier design
requirements. In the next sections we will consider methods for optimizing the relatively prime integer values, $s_1, \dots, s_K$,
used to construct our $(K, \alpha)_N$-coherent matrices. In what follows we will drop the slight distinction between
$(K, \alpha)_N$-coherent and $(K, \alpha)$-coherent matrices for ease of discussion.
Minimize

$$ m = \sum_{j=1}^{K_\alpha} s_j, \qquad K_\alpha = \lceil D\alpha \rceil, \qquad (5) $$

subject to the following constraints:

I. $s_1 < \cdots < s_K$.

II. $\prod_{j=1}^{\alpha} s_j < N \le \prod_{j=1}^{\alpha+1} s_j$.

III. $s_1, \dots, s_K$ are pairwise relatively prime.

Figure 2: Matrix Design Optimization Problem for Given $N$, $D$, and $\alpha$ values
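
For very small parameters the problem in Figure 2 can be solved by exhaustive search, which is a convenient way to sanity-check later bounds. The sketch below is illustrative only: the function name and the search cap `max_s` are hypothetical, and Constraint II is taken to read $\prod_{j \le \alpha} s_j < N \le \prod_{j \le \alpha+1} s_j$.

```python
from itertools import combinations
from math import ceil, gcd, prod

def solve_design_problem(N, D, alpha, max_s):
    """Exhaustively minimise sum(s) over increasing, pairwise coprime K-tuples
    drawn from [2, max_s] that satisfy Constraint II of Figure 2."""
    K = ceil(D * alpha)
    best = None
    for s in combinations(range(2, max_s + 1), K):      # increasing => Constraint I
        if not (prod(s[:alpha]) < N <= prod(s[:alpha + 1])):    # Constraint II
            continue
        if any(gcd(a, b) != 1 for a, b in combinations(s, 2)):  # Constraint III
            continue
        if best is None or sum(s) < sum(best):
            best = s
    return best

best = solve_design_problem(N=100, D=2, alpha=2, max_s=30)
print(best, sum(best))   # (3, 5, 7, 8) 23
```

Note that the minimizer here uses the non-prime value 8. The cap `max_s = 30` is harmless in this instance because any tuple containing a value above 30 already has a sum larger than 23.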

5 Optimizing the $(K, \alpha)$-Coherent Matrix Construction

Note that $K$ appears as part of a ratio involving $\alpha$ in each statement of Theorem 4. Hence, we will focus on constructing
$(K, \alpha)$-coherent matrices in which $K$ is a constant multiple of $\alpha$ in this section. For a given value of $D \in (1, \infty)$ we
can optimize the Section 4 methods for constructing a $(D\alpha, \alpha)$-coherent matrix with a small number of rows by
reformulating the matrix design problem as an optimization problem (see Figure 2). In this section we will develop
concrete bounds for the number of rows, $m$, as a function of $D$, $N$, and $\alpha$, that will appear in any $m \times N$ $(K = D\alpha, \alpha)$-coherent
matrix constructed as per Section 4. These bounds will ultimately allow us to cast the matrix optimization
problem in Figure 2 as a linear integer program in Section 6.
The following trivial fact will be useful below. 

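Lemma 4. Let $x_1, x_2, \dots, x_n \in [2, \infty)$ be such that $x_n > x_{n-1} > \cdots > x_1 \ge 2$. Then, $\sum_{j=1}^{n} x_j \le \prod_{j=1}^{n} x_j$.

Proof: This follows immediately from the fact that

$$ 1 + \frac{\sum_{j=1}^{n-1} x_j}{x_n} \le n \le 2^{n-1} \le \prod_{j=1}^{n-1} x_j. \qquad \square $$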

Define $p_0 = 1$ and let $p_l$ be the $l$th prime natural number. Thus, we have

$$ p_0 = 1,\; p_1 = 2,\; p_2 = 3,\; p_3 = 5,\; p_4 = 7,\; \dots \qquad (6) $$

Suppose that $S = \{s_1, \dots, s_K\}$ is a solution to the optimization problem presented in Figure 2 for given values of $\alpha$, $D$,
and $N$. Let $p_{q_S}$ be the largest prime factor appearing in any element of $S$. Finally, let

$$ q = \max \{\, q_S \mid S \text{ solves the optimization problem in Figure 2} \,\}. $$

The following lemma bounds $q$ as a function of $N$, $D$, and $\alpha$.

Lemma 5. Suppose $s_1, s_2, \dots, s_K$ satisfy all three constraints in Figure 2. Set $\tilde{m} = \sum_{j=1}^{K} s_j$. Next, let $p_t$ be the smallest
prime number greater than 2 for which

$$ p_t \cdot (K - \alpha - 1) + (K - \alpha - 1)(K - \alpha - 2) + (\alpha + 1) N^{\frac{1}{\alpha+1}} > \tilde{m} $$

holds. Then, $q < t + K - \alpha - 1$.
Proof:

Let $s'_1, s'_2, \dots, s'_K$ be a solution to the optimization problem presented in Figure 2. Set $m = \sum_{j=1}^{K} s'_j$. Note that there
must exist at least one prime, $p_l \in [p_t, p_{t+K-\alpha-1})$, which is not a prime factor of any $s'_j$ value. If no such prime exists,
then Lemma 4 applied to the prime factors of each $s'_j$ containing one of the primes in $[p_t, p_{t+K-\alpha-1})$ tells us that the
sum of $s'_1, s'_2, \dots, s'_K$ must be

$$ m \ge \sum_{j=0}^{K-\alpha-2} p_{t+j} + \sum_{j=1}^{\alpha+1} s'_j. $$

The second constraint in Figure 2 together with the arithmetic-geometric mean inequality tells us that we must always
have

$$ (\alpha + 1) N^{\frac{1}{\alpha+1}} \le (\alpha + 1) \Big( \prod_{j=1}^{\alpha+1} s'_j \Big)^{\frac{1}{\alpha+1}} \le \sum_{j=1}^{\alpha+1} s'_j. \qquad (7) $$

Furthermore, it is not difficult to see that

$$ \sum_{j=0}^{K-\alpha-2} p_{t+j} \ge \sum_{j=0}^{K-\alpha-2} (p_t + 2j) = p_t \cdot (K - \alpha - 1) + (K - \alpha - 1)(K - \alpha - 2) \qquad (8) $$

since $p_t > 2$. Thus, if every prime in $[p_t, p_{t+K-\alpha-1})$ appears as a prime factor in some $s'_j$, then

$$ m \ge p_t \cdot (K - \alpha - 1) + (K - \alpha - 1)(K - \alpha - 2) + (\alpha + 1) N^{\frac{1}{\alpha+1}} > \tilde{m}, $$

violating our assumption concerning the optimality of $s'_1, s'_2, \dots, s'_K$. This proves our claim regarding the existence of
at least one prime, $p_l \in [p_t, p_{t+K-\alpha-1})$, which is not a prime factor of any $s'_j$ value.

Now suppose that some $s'_{j'}$ contains a prime factor, $p_{l'}$, with $l' \ge t + K - \alpha - 1$. Substitute the largest currently
unused prime, $p_l \in [p_t, p_{t+K-\alpha-1})$, for $p_{l'}$ in the prime factorization of $s'_{j'}$ to obtain a smaller value, $s''_{j'}$. If we can
show that $s'_1, s'_2, \dots, s'_K$ with $s''_{j'}$ substituted for $s'_{j'}$ still satisfies all three Figure 2 constraints after reordering, we will
again have a contradiction to the assumed minimality of our original solution. In fact, it is not difficult to see that all
constraints other than II above will trivially be satisfied by construction. Furthermore, if $s''_{j'} \ge s'_{\alpha+1}$, then Constraint II
will also remain satisfied and we will violate our assumption that the $s'$ values originally had a minimal sum.

Finally, the second case where $p_t \le p_l \le s''_{j'} < s'_{\alpha+1}$ could only occur if originally

$$ \sum_{j=\alpha+2}^{K} s'_j \ge \sum_{j=1}^{K-\alpha-1} (p_t + 2j) = p_t \cdot (K - \alpha - 1) + (K - \alpha)(K - \alpha - 1). \qquad (9) $$

When combined with Equation (7) above, Equation (9) reveals that if $s''_{j'} < s'_{\alpha+1}$ then we must have originally had

$$ \sum_{j=1}^{K} s'_j \ge (\alpha + 1) N^{\frac{1}{\alpha+1}} + p_t \cdot (K - \alpha - 1) + (K - \alpha)(K - \alpha - 1) > \tilde{m}. $$

However, in this case the assumed minimality of $s'_1, s'_2, \dots, s'_K$ would again have been violated. □

We will now establish a slightly more refined result than that of Lemma 5.
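Lemma 6. Suppose $s_1, s_2, \dots, s_K$ satisfy all three constraints in Figure 2. Set $\tilde{m} = \sum_{j=1}^{K} s_j$. Let $L = \prod_{i=1}^{v} p_i$ for any
desired $v \in \mathbb{N}$, and let $\varphi(L) = \prod_{i=1}^{v} (p_i - 1)$. Let $p_t$ be the smallest prime number greater than 2 for which

$$
\begin{aligned}
& p_t \cdot (K - \alpha - 1) + (K - \alpha - 1)(K - \alpha - 2) \\
& \quad + \frac{(L - 2\varphi(L) - 2v)(\varphi(L) + v)}{2} \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \left( \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor - 1 \right) \\
& \quad + (L - 2\varphi(L) - 2v) \left( K - \alpha - 1 - (\varphi(L) + v) \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \right) \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \\
& \quad + (\alpha + 1) N^{\frac{1}{\alpha+1}} > \tilde{m}
\end{aligned}
$$

holds. Then, $q < t + K - \alpha - 1$.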
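Proof:

We will prove this lemma by modifying our proof of Lemma 5. In particular, we will modify formulas (8) and
(9). Note that amongst any $L$ consecutive numbers, there are at most $\varphi(L) + v$ prime numbers. Hence, we have that
$p_{l + u(\varphi(L)+v)} \ge p_l + uL$ for all $u \in \mathbb{N}$. Thus, we may replace formula (8) with

$$
\begin{aligned}
\sum_{j=0}^{K-\alpha-2} p_{t+j}
&\ge \sum_{j=0}^{K-\alpha-2} \left( p_{\,t + j - (\varphi(L)+v) \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor} + L \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \right) \\
&\ge \sum_{j=0}^{K-\alpha-2} \left( p_t + 2 \left( j - (\varphi(L)+v) \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \right) + L \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \right) \\
&= \sum_{j=0}^{K-\alpha-2} \left( p_t + 2j + (L - 2\varphi(L) - 2v) \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \right) \\
&= p_t \cdot (K - \alpha - 1) + (K - \alpha - 1)(K - \alpha - 2) + (L - 2\varphi(L) - 2v) \sum_{j=0}^{K-\alpha-2} \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \\
&= p_t \cdot (K - \alpha - 1) + (K - \alpha - 1)(K - \alpha - 2) \\
&\quad + \frac{(L - 2\varphi(L) - 2v)(\varphi(L) + v)}{2} \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \left( \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor - 1 \right) \\
&\quad + (L - 2\varphi(L) - 2v) \left( K - \alpha - 1 - (\varphi(L) + v) \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \right) \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor .
\end{aligned}
$$

Note that amongst any $L$ consecutive numbers, a maximal subset of pairwise relatively prime numbers has at most
$\varphi(L) + v$ numbers. Hence, we may also replace formula (9) by a similar argument to above with

$$
\begin{aligned}
\sum_{j=\alpha+2}^{K} s'_j = \sum_{j=0}^{K-\alpha-2} s'_{\alpha+2+j}
&\ge \sum_{j=0}^{K-\alpha-2} \left( s'_{\,\alpha + 2 + j - (\varphi(L)+v) \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor} + L \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \right) \\
&\ge \sum_{j=0}^{K-\alpha-2} \left( p_t + 2 \left( 1 + j - (\varphi(L)+v) \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \right) + L \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \right) \\
&= \sum_{j=0}^{K-\alpha-2} \left( p_t + 2(1 + j) + (L - 2\varphi(L) - 2v) \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \right) \\
&= p_t \cdot (K - \alpha - 1) + (K - \alpha)(K - \alpha - 1) + (L - 2\varphi(L) - 2v) \sum_{j=0}^{K-\alpha-2} \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \\
&= p_t \cdot (K - \alpha - 1) + (K - \alpha)(K - \alpha - 1) \\
&\quad + \frac{(L - 2\varphi(L) - 2v)(\varphi(L) + v)}{2} \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \left( \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor - 1 \right) \\
&\quad + (L - 2\varphi(L) - 2v) \left( K - \alpha - 1 - (\varphi(L) + v) \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \right) \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor .
\end{aligned}
$$

By replacing (8) and (9) with these bounds in the proof of Lemma 5 we obtain the desired result. □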

The following corollary of Lemma 5 provides a simple initial upper bound on the largest prime factor that may
appear in any solution to the optimization problem presented in Figure 2.

Corollary 7. Let $r$ be such that $\prod_{j=1}^{\alpha} p_{r+j} < N \le \prod_{j=1}^{\alpha+1} p_{r+j}$, and set $\tilde{m} = \sum_{j=1}^{K} p_{r+j}$. Next, let $p_t$ be the smallest prime
larger than 2 for which

$$ p_t \cdot (K - \alpha - 1) + (K - \alpha - 1)(K - \alpha - 2) + (\alpha + 1) N^{\frac{1}{\alpha+1}} > \tilde{m} $$

holds. Then, $q < t + K - \alpha - 1$.

Proof:

It is not difficult to see that

$$ s_1 = p_{r+1},\; s_2 = p_{r+2},\; \dots,\; s_K = p_{r+K} \qquad (10) $$

collectively satisfy all three constraints in Figure 2. Applying Lemma 5 yields the stated result. □
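
The bound in Corollary 7 is straightforward to evaluate. The following is a minimal sketch under stated assumptions: the function names and the search cap `max_r` are hypothetical, the conclusion is read as the strict bound $q < t + K - \alpha - 1$, and `None` is returned when no suitable $r$ or $p_t$ is found within the generated primes.

```python
from math import ceil, prod

def first_primes(count):
    """Return [p_1, ..., p_count] using trial division."""
    primes, c = [], 2
    while len(primes) < count:
        if all(c % p for p in primes):
            primes.append(c)
        c += 1
    return primes

def corollary7_bound(N, D, alpha, max_r=50):
    """Return (r, m_tilde, t, t + K - alpha - 1) as in Corollary 7, or None."""
    K = ceil(D * alpha)
    p = [1] + first_primes(max_r + K + 200)      # p[l] = p_l with p_0 = 1
    for r in range(max_r + 1):
        block = p[r + 1 : r + K + 1]             # s_j = p_{r+j}, Equation (10)
        if prod(block[:alpha]) < N <= prod(block[:alpha + 1]):
            break
    else:
        return None
    m_tilde = sum(block)
    c = K - alpha - 1
    root_term = (alpha + 1) * N ** (1.0 / (alpha + 1))
    for t in range(2, len(p)):                   # p_t must exceed 2, so t >= 2
        if p[t] * c + c * (c - 1) + root_term > m_tilde:
            return r, m_tilde, t, t + K - alpha - 1
    return None

print(corollary7_bound(N=100, D=2, alpha=2))   # (1, 26, 6, 7)
```

For $N = 100$, $D = 2$, $\alpha = 2$ this uses $s = (3, 5, 7, 11)$ with $\tilde{m} = 26$ and $p_t = 13$, so no prime beyond $p_6 = 13$ needs to be considered as a factor of any $s_j$.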



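Similarly, one can obtain the following corollary from Lemma 6.

Corollary 8. Let $r$ be such that $\prod_{j=1}^{\alpha} p_{r+j} < N \le \prod_{j=1}^{\alpha+1} p_{r+j}$, and set $\tilde{m} = \sum_{j=1}^{K} p_{r+j}$. Let $L = \prod_{i=1}^{v} p_i$ for any $v \in \mathbb{N}$,
and let $\varphi(L) = \prod_{i=1}^{v} (p_i - 1)$. Next, let $p_t$ be the smallest prime larger than 2 for which

$$
\begin{aligned}
& p_t \cdot (K - \alpha - 1) + (K - \alpha - 1)(K - \alpha - 2) \\
& \quad + \frac{(L - 2\varphi(L) - 2v)(\varphi(L) + v)}{2} \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \left( \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor - 1 \right) \\
& \quad + (L - 2\varphi(L) - 2v) \left( K - \alpha - 1 - (\varphi(L) + v) \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \right) \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \\
& \quad + (\alpha + 1) N^{\frac{1}{\alpha+1}} > \tilde{m}
\end{aligned}
$$

holds. Then, $q < t + K - \alpha - 1$.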
The following lemma provides upper and lower bounds for the members of any valid solution to the optimization
problem in Figure 2 as functions of $N$, $D$, and $\alpha$. This lemma is critical to the formulation of the optimization problem
in Figure 2 as a linear integer program in Section 6.

Lemma 9. The following bounds hold for any valid solution, $S = \{s_1, s_2, \dots, s_K\}$, to the optimization problem in
Figure 2:

1. $s_1 < \cdots < s_K$.

2. $s_1 \ge 2,\; s_2 \ge 3,\; \dots,\; s_K \ge p_K$.

3. $s_1 < N^{\frac{1}{\alpha}}$ and $s_{\alpha+1} \ge N^{\frac{1}{\alpha+1}}$.

4. Let $t \in \mathbb{N}$ be defined as in Lemma 5, Lemma 6, Corollary 7, or Corollary 8. Then, $s_K < p_{t+K-\alpha-1}$.

Proof:

Assertion (1) is a restatement of Constraint I in Figure 2. The second assertion follows immediately from the fact
that the ordered $s_j$ values must be pairwise relatively prime (i.e., Constraint III). The third assertion follows easily from
Constraint II. Assertion (4) follows from an argument analogous to the proof of Lemma 5. That is, if $s_K \ge p_{t+K-\alpha-1}$,
then we may substitute $s_K$ with the largest prime in $[p_t, p_{t+K-\alpha-1})$ which is not currently a prime factor of $s_1, \dots, s_K$
and thereby derive a contradiction. □

The following lemmas provide concrete lower bounds for $m$ in terms of $N$, $D$, and $\alpha$ (see Equation (5) in Figure 2).
These lemmas will ultimately allow us to judge the possible performance of any solution to our optimization problem
based solely on the value of $\alpha$ whenever $N$ and $D$ are fixed.

Lemma 10. Any solution to the optimization problem in Figure 2 must have

$$ m \ge K N^{\frac{1}{\alpha+1}} + (K - \alpha)(K - \alpha - 1). $$

Proof:

We know that $s_{\alpha+2} > s_{\alpha+1} \ge N^{\frac{1}{\alpha+1}}$ from Lemma 9. Hence, we can see that

$$ \sum_{j=\alpha+2}^{K} s_j \ge \sum_{j=1}^{K-\alpha-1} \left( s_{\alpha+1} + 2j \right) \ge (K - \alpha - 1) N^{\frac{1}{\alpha+1}} + (K - \alpha)(K - \alpha - 1). $$

Combining this lower bound with Equation (7) proves the lemma. □



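Corollary 11. Let $L = \prod_{i=1}^{v} p_i$ for any desired $v \in \mathbb{N}$, and let $\varphi(L) = \prod_{i=1}^{v} (p_i - 1)$. Any solution to the optimization
problem in Figure 2 must have

$$
\begin{aligned}
m \ge{} & K N^{\frac{1}{\alpha+1}} + (K - \alpha)(K - \alpha - 1) \\
& + \frac{(L - 2\varphi(L) - 2v)(\varphi(L) + v)}{2} \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \left( \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor - 1 \right) \\
& + (L - 2\varphi(L) - 2v) \left( K - \alpha - 1 - (\varphi(L) + v) \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \right) \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor .
\end{aligned}
$$

Proof:

Note that amongst any $L$ consecutive integers, a maximal subset of pairwise relatively prime numbers has at most
$\varphi(L) + v$ numbers. As in the proof of Lemma 10 this corollary follows by combining Equation (7) with the fact that

$$
\begin{aligned}
\sum_{j=\alpha+2}^{K} s_j = \sum_{j=0}^{K-\alpha-2} s_{\alpha+2+j}
&\ge \sum_{j=0}^{K-\alpha-2} \left( s_{\,\alpha + 2 + j - (\varphi(L)+v) \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor} + L \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \right) \\
&\ge \sum_{j=0}^{K-\alpha-2} \left( s_{\alpha+1} + 2 \left( 1 + j - (\varphi(L)+v) \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \right) + L \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \right) \\
&\ge \sum_{j=0}^{K-\alpha-2} \left( N^{\frac{1}{\alpha+1}} + 2(1 + j) + (L - 2\varphi(L) - 2v) \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \right) \\
&= N^{\frac{1}{\alpha+1}} \cdot (K - \alpha - 1) + (K - \alpha)(K - \alpha - 1) + (L - 2\varphi(L) - 2v) \sum_{j=0}^{K-\alpha-2} \left\lfloor \frac{j}{\varphi(L)+v} \right\rfloor \\
&= N^{\frac{1}{\alpha+1}} \cdot (K - \alpha - 1) + (K - \alpha)(K - \alpha - 1) \\
&\quad + \frac{(L - 2\varphi(L) - 2v)(\varphi(L) + v)}{2} \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \left( \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor - 1 \right) \\
&\quad + (L - 2\varphi(L) - 2v) \left( K - \alpha - 1 - (\varphi(L) + v) \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor \right) \left\lfloor \frac{K-\alpha-2}{\varphi(L)+v} \right\rfloor . \qquad \square
\end{aligned}
$$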



In the next section we investigate asymptotic bounds of $m$ in terms of $D$ and $N$. This will, among other things,
allow us to judge the quality of our matrices with respect to the lower bound in part 6 of Theorem 4.

5.1 Asymptotic Upper and Lower Bounds

We begin this section by proving an asymptotic lower bound for the number of rows in any $(K, \alpha)$-coherent matrix
created as per Section 4. Recall that we have fixed $K$ to be a multiple of $\alpha$ so that $K = K_\alpha = \lceil D\alpha \rceil$ for some $D \in (1, \infty)$.
We have the following lower bound for $m$ as a function of $D$ and $N$.

Lemma 12. Suppose that $2 < D < N^{1-\tau}$, where $\tau > 0$ is some fixed constant. For any solution to the optimization
problem in Figure 2, where $\alpha$ can freely be chosen, one has

$$ m \gg \frac{D^2 (\log N)^2}{\log(D \log N)} $$

for sufficiently large values of $D \log N$.
Proof: 

Let Q = DlogM Suppose that S = [s\, . . .s^} is a solution to the optimization problem in Figure [2] where a can 
be freely chosen and K = K a - \Da\ By Properties (I) and (II) in Figure [2] 



Si > 



Da 
_a + 1 



logN > 



Q 
Let $q = q_S$ be the largest natural number such that $p_q \mid \prod_{i=1}^{K} s_i$. For $1 \le i \le q$, let $w_i$ be the integer such that



pf'll n£i s i- Since a + b < ab for a, b e N, it follows from Lemma 13 

Q 2 



that 



!=1 1=1 



^ logQ 



as Q — > oo. This implies that for 1 < z < ^, we have p f ' < j^jj^j, where C is some absolute positive constant. 
Therefore, 

wQ 2/w Q 



£ ^logp, <£ £ Mogp« £ (] ^ n]1/a « 



Since . a;, logp, = logs, > j, we have that 

^ w,logp,»Q 



for sufficiently large values of Q. Let 



/ CQ 2 



We have that if, e {0, 1} when p, > "\Jj^j- Also, whenever it>; > 1, it follows that logp, < log ^ log Q. Thus, 

CO 

for sufficiently large values of Q, we have W > j^g, where C is some absolute positive constant. By the Prime 
Number Theorem, 

Q 2 D 2 (logN) 2 



> J^pf > £ p, » £ zTogz» 



__ logQ log(DlogN) 

!S logQ 

for sufficiently large values of Q = D logN. □ 



1=1 1=1 ;<r C'Q ,,. C'Q 



Part 6 of Theorem 4 informs us that $m$ must be $\Omega\left( D^2 \log_D N \right)$ for any $m \times N$ $(K = \lceil D\alpha \rceil, \alpha)$-coherent matrix. On
the other hand, Lemma 12 above tells us that any $m \times N$ $(K, \alpha)$-coherent matrix constructed via Section 4 must have
$m = \Omega\left( \frac{D^2 \log^2 N}{\log(D \log N)} \right)$. Note that the lower bounds for matrices constructed as per Section 4 are worse by approximately
a factor of $\log N$. This is probably an indication that the $(K = \lceil D\alpha \rceil, \alpha)$-coherent matrix construction in Section 4 is
suboptimal. Certainly suboptimality of the construction in Section 4 would not be surprising given that the construction
is addressing a more constrained design problem (i.e., we demand small Fourier sampling requirements).

Next we show that the asymptotically best main term for $m$ in the optimization problem in Figure 2 can be obtained
by taking each $s_j$ to be a prime. This proves that the asymptotic lower bound given in Lemma 12 is tight.

Lemma 13. Suppose that $2 < D < N^{1-\tau}$, where $\tau > 0$ is some fixed constant. If we are able to select the value for $\alpha$,
the optimization problem in Figure 2 can be solved by taking the $s_j$ to be primes in such a way that guarantees that

$$ m \ll \frac{D^2 (\log N)^2}{\log(D \log N)} $$

as $D \log N \to \infty$.

Proof:



Let Q = DlogN. Since we restrict to the case that 1 < D < N 1_T , it follows that as Q — > oo, we also have that 
N — > oo. Let r = max ([ wq] / 9) . Note that log r > 2. Also, by the Prime Number Theorem, p r+ i ~ Q < N 1_T log N 
as Q — > oo. We will assume that Q is large enough that p r+ \ < N. Choose a e IN such that YYJ=i Pr+j < N < 
nj=i Pr+y- F°i" 1 < * < Kft = l~Da~|, let s, = Vr+i- Note that our elements S; already satisfy the conditions in Figure^ 



We are left to establish a bound on a and then estimate s i 



Note that a < f> whenever Pr+j ^ N, which is equivalent to T^tl logp r +y ^ logN. Let j8 = [" 2 ^°|^ "| . We 
have that pk >k for k > 1. Hence, we have that 



0+1 0+1 
^logp r+ , > ^log(r + 

!=1 !=1 

> 



Xr+p+L 
log x 



= (r + jS + l)log(r + jS + 1) - (r + jS + 1) - rlogr + r 

= OS + l)(log(r + JS + 1) - 1) + r log (l + ^) 

>0S + l)(log(r + /3 + l)-l) 
>)S(logr-l) 
21ogN logr 
logr 2 
= logN. 

Note that as Q — > oo, r + Kp <sc j^Bq- Thus, since a < |3, we have by ll28l Lemma 6] and the Prime Number Theorem 
that 

~ Kp - <k s _ ((r + Kp)log(r + Kp)f 



Pr+i < > Pr+i < Ci- < C 2 



,=i ;=i lo §^ log((r + X /3 )log(r + X /i ))' 

for some absolute constants Ci and C 2 . As Q — » oo, 



((r + fy) log(r + K p )f q 2 D 2 (log N) 2 

m «: — — «: — — = — - — — . □ 

log((r + JC j3 )log(r + ^)) lo gQ log(DlogN) 



Although Lemma 13 shows that simply using primes for our $s_j$ values is asymptotically optimal, it is important
to note that the convergence of such primes-only solutions to the optimal value as $D \log N \to \infty$ is likely very slow.
For real world values of $N$ and $D$ the more general criterion that the $s_j$ values be pairwise relatively prime can produce
significantly smaller $m$ values. This is demonstrated empirically in Section 7. However, Lemma 13 also formally
justifies the idea that the $s_j$ values can be restricted to smaller subsets of relatively prime integers (e.g., the prime
numbers) before solving the optimization problem in Figure 2 without changing the asymptotic performance of the
generated solutions. This idea can help make the (approximate) solution of the optimization problem in Figure 2 more
computationally tractable in practice.

6 Formulation of the Matrix Design Problem as a Linear Integer Program 

To formulate the problem as a linear integer program, we define $K = K_\alpha = \lceil D\alpha \rceil$ and $B = p_{t+K-\alpha-1}$ as in Part 4 of
Lemma 9. Let $s_{j,i} \in \{0, 1\}$ for $j \in [1, K] \cap \mathbb{N}$ and $i \in [1, B] \cap \mathbb{N}$. We then let $s_j = \sum_{i=1}^{B} s_{j,i} \cdot i$ and, for $k \in [1, t+K-\alpha-1] \cap \mathbb{N}$,
define

$$ \delta_{k,i} = \begin{cases} 1 & \text{if } p_k \mid i \\ 0 & \text{otherwise} \end{cases}. \qquad (11) $$

Then, for a given $\alpha$, we can minimize Equation (5) by minimizing

$$ m = \sum_{j=1}^{K} s_j = \sum_{j=1}^{K} \sum_{i=1}^{B} s_{j,i} \cdot i \qquad (12) $$

subject to the following linear constraints:

1. $\sum_{i=1}^{B} s_{j,i} = 1$ for all $j \in [1, K] \cap \mathbb{N}$.

2. $s_{j,i} \in \{0, 1\}$ for all $j \in [1, K] \cap \mathbb{N}$ and $i \in [1, B] \cap \mathbb{N}$.

3. $\sum_{i=1}^{B} \left( i \cdot s_{j+1,i} - i \cdot s_{j,i} \right) \ge 1$ for all $j \in [1, K-1] \cap \mathbb{N}$.

4. $\sum_{j=1}^{\alpha} \sum_{i=1}^{B} s_{j,i} \cdot \ln i < \ln N \le \sum_{j=1}^{\alpha+1} \sum_{i=1}^{B} s_{j,i} \cdot \ln i$.

5. $s_{j,i} = 0$ for all $j \in [1, K] \cap \mathbb{N}$ and $i \in [1, p_j - 1] \cap \mathbb{N}$.

6. $\sum_{j=1}^{K} \sum_{i=1}^{B} \delta_{k,i} \cdot s_{j,i} \le 1$ for all $k \in [1, t+K-\alpha-1] \cap \mathbb{N}$.

The first and second constraints together state that for each $j$, $s_{j,i}$ is nonzero for exactly one value of $i \in [1, B] \cap \mathbb{N}$,
implying that $s_j = i$. This in turn, by the third constraint, implies that $s_1 < s_2 < \cdots < s_K$, which is Constraint I in
Figure 2. Upon applying the natural logarithm in Constraint II in Figure 2 to convert a nonlinear constraint to a linear
constraint, one obtains something equivalent to our fourth constraint above. The fifth constraint above simply forces
$s_j \ge p_j$, which will be true for any solution to the optimization problem in Figure 2. The last constraint ensures that
$s_1, \dots, s_K$ are pairwise relatively prime, which is Constraint III in Figure 2. Hence, the optimization problem in this
section is equivalent to the optimization problem in Figure 2.
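
The binary encoding and the six linear constraints can be exercised directly on a candidate solution, without any solver. The following is a minimal sketch: `encode` and `satisfies_linear_constraints` are hypothetical helper names, the list `primes` is assumed to hold $p_1, \dots, p_{t+K-\alpha-1}$, and the logarithmic test in Constraint 4 uses floating point, so instances where a product equals $N$ exactly would need an exact integer comparison instead.

```python
from math import log

def encode(s, B):
    """Binary variables s_{j,i}: row j has a single 1 in column i = s_j (i is 1-based)."""
    return [[1 if i == sj else 0 for i in range(1, B + 1)] for sj in s]

def satisfies_linear_constraints(x, N, alpha, primes):
    """Check Constraints 1-6 for a K x B 0/1 array x; primes[k-1] plays the role of p_k."""
    K, B = len(x), len(x[0])
    idx = range(1, B + 1)
    val = [sum(i * x[j][i - 1] for i in idx) for j in range(K)]         # s_j
    logs = [sum(log(i) * x[j][i - 1] for i in idx) for j in range(K)]   # ln s_j
    c1 = all(sum(row) == 1 for row in x)
    c2 = all(v in (0, 1) for row in x for v in row)
    c3 = all(val[j + 1] - val[j] >= 1 for j in range(K - 1))
    c4 = sum(logs[:alpha]) < log(N) <= sum(logs[:alpha + 1])
    c5 = all(x[j][i - 1] == 0 for j in range(K) for i in range(1, primes[j]))
    c6 = all(sum(x[j][i - 1] for j in range(K) for i in idx if i % p == 0) <= 1
             for p in primes)
    return c1 and c2 and c3 and c4 and c5 and c6

primes = [2, 3, 5, 7, 11, 13, 17]   # p_1, ..., p_7, so B = p_7 = 17
B = primes[-1]
print(satisfies_linear_constraints(encode((3, 5, 7, 8), B), 100, 2, primes))  # True
print(satisfies_linear_constraints(encode((3, 5, 7, 9), B), 100, 2, primes))  # False
```

The second candidate fails Constraint 6 because the prime 3 divides both 3 and 9, which is exactly the pairwise relative primality requirement of Constraint III in Figure 2.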



7 Numerical Experiments 

In this section we investigate the optimal Fourier sampling requirements related to $m \times N$ $(K = \lceil D\alpha \rceil, \alpha)$-coherent
matrices, optimized over the $\alpha$ parameter, for several values of $D$ and $N$. This is done for given values of $D \in (1, \infty)$
and $N \in \mathbb{N}$ by solving the optimization problem in Figure 2 via the linear integer program presented in Section 6 for
all feasible values of $\alpha \in [1, \log_2 N] \cap \mathbb{N}$.$^5$ The solution yielding the smallest Fourier sampling requirement, $m - K_\alpha + 1$
from Lemma 3, for the given $D$ and $N$ values (minimized over all $\alpha$ values) is the one reported for experiments in
this section. Each linear integer program was solved with IBM ILOG OPL-CPLEX with parameters generated using
Microsoft Visual Studio. Examples of the actual files run can be downloaded from the contact author's website.$^6$

In order to make our numerical experiments more meaningful we computed optimal incoherent matrices which 
also have the RIP2 (see part 3 of Theorem Hence, we set D = ^ for a given sparsity value k e [l,N]nN and 

$\epsilon \in (0, 1)$. In all experiments the value of $\epsilon$ was fixed to be slightly less than $3/(4 + \sqrt{6}) \approx 0.465$, which ensures
that $\ell_1$-minimization can be utilized with the produced RIP$_2$ matrices for accurate Fourier approximation (e.g., see
Theorem 2.7 in [35]).

Three variants of the optimization problem in Figure 2 were solved in order to determine the minimal Fourier
sampling requirements associated with various classes of $(K, \alpha)$-coherent matrices created via Section 4. These
three variants include the:

5 It is important to note that many values of $\alpha$ can be disqualified as optimal without solving a linear integer program by comparing previous
solutions to the lower bounds given in Lemma 10 and Corollary 11.
6 http://www.math.duke.edu/~markiwen/DukePage/code.htm
Figure 3: On the Left: The minimal Fourier sampling requirements, $m - K_\alpha + 1$ minimized over all feasible $\alpha \in [1, 14] \cap \mathbb{N}$,
for any possible $m \times 2^{14}$ $(K, \alpha)$-coherent matrix constructed via Section 4. Here $\epsilon$ was fixed
to be $4/(6 + \sqrt{7}) \approx 0.463$, and the sparsity parameter, $k$, was varied between 2 and 11. On the Right: The minimal
Fourier sampling requirements, $m - K_\alpha + 1$ minimized over all feasible $\alpha \in [1, 22] \cap \mathbb{N}$, for any possible $m \times 2^{22}$
$(K, \alpha)$-coherent matrix constructed via Section 4. Here $\epsilon$ was again fixed to be $4/(6 + \sqrt{7}) \approx 0.463$, and
the sparsity parameter, $k$, was varied between 2 and 19.



1. Relatively Prime optimization problem exactly as stated in Figure 2 and reformulated in Section 6.

2. Powers of Primes optimization problem. Here the $s_j$ values are further restricted to each be a power of a single
prime number.

3. Primes optimization problem. Here each $s_j$ value is further restricted to simply be a prime number.

These different variants allow some tradeoff between computational complexity and the minimality of the generated
incoherent matrices. See Figure 3 for a comparison of the solutions to these optimization problems for two example
values of $N$.

In creating the solutions graphed in Figure 3, computer memory was the primary constraining factor. For each
of the two values of $N$ the sparsity, $k$, was increased until computer memory began to run out during the solution of
one of the required linear integer programs.$^7$ All linear integer programs which ran to completion did so in less than
90 minutes (most finishing in a few minutes or less). Not surprisingly, the relatively prime solutions always produce
smaller Fourier sampling requirements than the more restricted powers of primes solutions, with the tradeoff being
that they are generally more difficult to solve. Similarly, the powers of primes solutions always led to smaller Fourier
sampling requirements than the even more restricted primes solutions.

For the sake of comparison, the left plot in Figure 3 also includes Fourier sampling results for RIP$_2$($2^{14}$, $k$, $\epsilon < 0.465$)
matrices created via random sampling based incoherence arguments for each sparsity value. These random Fourier
sampling requirements were calculated by choosing rows from a $2^{14} \times 2^{14}$ inverse DFT matrix, $\mathcal{F}^{-1}$, uniformly at
random without replacement. After each row was selected, the $\mu$-coherence of the submatrix formed by the currently
selected rows was calculated (see Definition 3). As soon as the coherence became small enough that Theorem 3
guaranteed that the matrix would have the RIP$_2$($2^{14}$, $k$, $\epsilon < 0.465$) for the given value of $k$, the total number of inverse
DFT rows selected up to that point was recorded as a trial Fourier sampling value. This entire process was repeated 100
times for each value of $k$. The smallest Fourier sampling value achieved out of these 100 trials was then reported for
each sparsity $k$ in the left plot of Figure 3.

Looking at the plot on the left in Figure 3 we can see that the randomly selected submatrices guaranteed to have
the RIP$_2$ require fewer Fourier samples than the deterministic matrices constructed herein. Hence, if Fourier sampling
complexity is one's primary concern, traditional matrix design techniques should be utilized. However, it is important
to note that such randomly constructed Fourier matrices cannot currently be utilized in combination with $o(N)$-time
Fourier approximation algorithms. Our deterministic incoherent matrices, on the other hand, have associated sublinear-time
approximation algorithms (see the first part of Theorem 4).

7 A modest desktop computer with an Intel Core i7-920 processor @ 2.67 GHz and 2.99 GB of RAM was used to solve all linear integer programs
reported on in Figure 3.

Finally, we conclude this paper by noting that heuristic solution methods can almost certainly be developed for
solving the optimization problem in Figure 2. Such methods are often successful at decreasing memory usage and
computation time while still producing near-optimal results. We leave further consideration of such approaches as
future work.
A Proof of the First Statement in Theorem 4

We will prove a slightly more general variant of the first statement given in Theorem 4. Below we will work with
$(K, c_{\min}, \alpha)$-coherent matrices.

Definition 6. Let $K \in [1, N] \cap \mathbb{N}$ and $c_{\min}, \alpha \in \mathbb{R}^{+}$. An $m \times N$ positive real matrix, $M \in [0, \infty)^{m \times N}$, is called
$(K, c_{\min}, \alpha)$-coherent if all of the following properties hold:

1. Every column of $M$ contains at least $K$ nonzero entries.

2. All nonzero entries are at least as large as $c_{\min}$.

3. For all $j, l \in [0, N)$ with $j \ne l$, the associated columns, $M_{\cdot,j}$ and $M_{\cdot,l} \in [0, \infty)^{m}$, have $M_{\cdot,j} \cdot M_{\cdot,l} \le \alpha$.

Clearly, any $(K, \alpha)$-coherent matrix will also be $(K, 1, \alpha)$-coherent. Other examples of $(K, c_{\min}, \alpha)$-coherent matrices
include "corrupted" or "noisy" $(K, \alpha)$-coherent matrices, as well as matrices whose columns are spherical code words
from the first orthant of $\mathbb{R}^{m}$. In what follows we will generalize results and constructions from [26]. We will give
self-contained proofs whenever possible, although it will be necessary on occasion to state generalized results from
[26] whose proofs we omit.

A.1 Some Useful Properties of $(K, c_{\min}, \alpha)$-Coherent Matrices

In what follows, $M \in [0, \infty)^{m \times N}$ will always refer to a given $m \times N$ $(K, c_{\min}, \alpha)$-coherent matrix. Let $n \in [0, N) \cap \mathbb{N}$.
We define $M(K, n)$ to be the $K \times N$ matrix created by selecting the $K$ rows of $M$ with the largest entries in its $n$th
column. Furthermore, we define $M'(K, n)$ to be the $K \times (N-1)$ matrix created by deleting the $n$th column of $M(K, n)$.
Thus, if

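$$ M_{j_1,n} \ge M_{j_2,n} \ge \cdots \ge M_{j_m,n} $$

then

$$ M(K, n) = \begin{pmatrix} M_{j_1} \\ M_{j_2} \\ \vdots \\ M_{j_K} \end{pmatrix} \qquad (13) $$

and

$$ M'(K, n) = \begin{pmatrix}
M_{j_1,0} & M_{j_1,1} & \cdots & M_{j_1,n-1} & M_{j_1,n+1} & \cdots & M_{j_1,N-1} \\
M_{j_2,0} & M_{j_2,1} & \cdots & M_{j_2,n-1} & M_{j_2,n+1} & \cdots & M_{j_2,N-1} \\
\vdots & & & & & & \vdots \\
M_{j_K,0} & M_{j_K,1} & \cdots & M_{j_K,n-1} & M_{j_K,n+1} & \cdots & M_{j_K,N-1}
\end{pmatrix}. \qquad (14) $$

The following two lemmas motivate the main results of this section.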



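Lemma 14. Suppose $M$ is a $(K, c_{\min}, \alpha)$-coherent matrix. Let $n \in [0, N) \cap \mathbb{N}$, $k \in \left[ 1, K \cdot c_{\min}^2 / \alpha \right] \cap \mathbb{N}$, and $x \in \mathbb{C}^{N-1}$.
Then, at most $\frac{k\alpha}{c_{\min}^2}$ of the $K$ entries of $M'(K, n) \cdot x$ will have magnitude greater than or equal to $\frac{c_{\min}}{k} \cdot \|x\|_1$.

Proof:

We have that

$$ \left| \left\{ j \;:\; \left| \left( M'(K, n) \cdot x \right)_j \right| \ge \frac{c_{\min}}{k} \cdot \|x\|_1 \right\} \right| \le \frac{k \cdot \| M'(K, n) \cdot x \|_1}{c_{\min} \cdot \|x\|_1} \le \frac{k}{c_{\min}} \cdot \| M'(K, n) \|_1. $$

Focusing now on $M'(K, n)$, we can see that

$$ \| M'(K, n) \|_1 = \max_{l \in [1, N-1] \cap \mathbb{N}} \left\| \left( M'(K, n) \right)_{\cdot,l} \right\|_1 \le \max_{l \ne n} \frac{\left\langle \left( M(K, n) \right)_{\cdot,n}, \left( M(K, n) \right)_{\cdot,l} \right\rangle}{c_{\min}} \le \frac{\alpha}{c_{\min}}. \qquad (15) $$

The result follows. □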



With Lemma 14 in hand we can now prove our second lemma. However, we must first establish some notation.
For any given $x \in \mathbb{C}^{N}$ and subset $S \subseteq [0, N) \cap \mathbb{N}$, we will let $x_S \in \mathbb{C}^{N}$ be equal to $x$ on the indexes in $S$ and be zero
elsewhere. Thus,

$$ (x_S)_i = \begin{cases} x_i & \text{if } i \in S \\ 0 & \text{otherwise} \end{cases}. $$

Furthermore, for a given integer $k < N$, we will let $S_k^{\mathrm{opt}} \subset [0, N) \cap \mathbb{N}$ be the first $k$ element subset of $[0, N) \cap \mathbb{N}$
in lexicographical order with the property that $|x_s| \ge |x_t|$ for all $s \in S_k^{\mathrm{opt}}$ and $t \in [0, N) \cap \mathbb{N} - S_k^{\mathrm{opt}}$. Thus, $S_k^{\mathrm{opt}}$
contains the indexes of $k$ of the largest magnitude entries in $x$. Finally, we will define $x_k^{\mathrm{opt}}$ to be $x_{S_k^{\mathrm{opt}}}$, an optimal
$k$-term approximation to $x$.

Lemma 15. Suppose M is a (K, c^, a)-coherent matrix. Let n 6 [0,N) n N, k e [l, K ■ c^/a] D N, S C [0, N) n N 
vv/f/; |S| < k, and x e (D N_1 . Then, M'(K, n) ■ x and M'(K, n) ■ (x - x$) will differ in at most Jp- of their K entries. 

min 

Let 1 e (D N_1 be the vector of all ones. We have that 



(M'(K,n) ■ x) ] + (M'(K,n) ■ (x-x s )) j 



(M'(K,n)-x*s)j±0 



since all the nonzero entries of M'(K,n) are at least as large as c m in. Applying Lemma 
k = Is L = |S| finishes the proof. □ 



(M'(K,n)-i s ).>c D 
with x 



■ 



and 



By combining the two Lemmas above we are able to bound the accuracy with which we can approximate any 
entry of an arbitrary complex vector x e <C N using only the measurements from a (K, Cm^a) -coherent matrix. The 
following lemma motivates the remainder of this appendix. 

Lemma 16. Suppose M is a (K,c m i n ,a)-coherent matrix. Let n 6 [0,N) Pi N, k 6 \l,K ■ c^ in /aj nN.ee (0,1], 
c e [2, oo) n N, and x 6 (D N . TfK > c ■ (ka/c^^e) then more than — • K of the K entries ofM(K, n) ■ x can be used 

\\-> -apt II 

ettx-x^Al 

to estimate x n to within — — , accuracy. 
Proof: 

Define y e (D N_1 to be y = (x ,Xi, . . .,x n _i,x n+1 , . . .,x N _i). We have 

M(K,n) ■ x = x n - (M(K r n)). n + M'(K,n) ■ y. 
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Applying Lemma jl5j with k = (k/e) demonstrates that at most -J^— entries of M'(K, n) ■ y differ from their corre- 



sponding entries in M'(K, n) ■ (y- iffi J. Of the remaining K — |jr- entries of M'(K, n) ■ y, at most will have 

' ' ' min min 

/k by Lemma 



magnitudes greater than or equal to e ■ c n 



-> -opt 

\y - y£ie) 



K-2 



i 

ka 



14 



Hence, at least 



e ■ c~ 



> C -^-K 



entries of M'(K, n) ■ y will have a magnitude no greater than 



-> -opt 

y - y£ie) 



-> -opt 

X ~ X (k/e) 



(M(K,n)-x) . 

Therefore, ( M ( Kn y. ' will approximate x„ to within the stated accurracy for more than ^= • K values j e [1, K]nN. □ 



Lemma 16 generalizes Theorem 4 in Section 3 of [26]. Thus, Lemma 16 can be used to modify the proof of Theorem 6 in Section 4 of [26] in order to prove that a variant of Algorithm 2 from [26] will provide instance optimal approximation guarantees along the lines of Equation 2 for general compressed sensing recovery problems.^8 The intuitive idea is as follows: If the constant $c$ from Lemma 16 is set to be at least 4, then more than half of the $K$ entries of $\mathcal{M}(K,n) \cdot \vec{x}$ can accurately estimate $x_n$. This is enough to guarantee that the imaginary part of $x_n$ will be accurately estimated by the median of the imaginary parts of all $K$ properly scaled entries of $\mathcal{M}(K,n) \cdot \vec{x}$. Of course, the real part of $x_n$ can also be estimated in a similar fashion. Hence, given both $\mathcal{M}$ and $\mathcal{M}\vec{x} \in \mathbb{C}^m$, Lemma 16 ensures that computing $N$ medians of $K$ reweighted elements of $\mathcal{M}\vec{x}$ will allow us to accurately estimate all $N$ entries of $\vec{x}$. If we do this and then report only the largest $2k$ estimates in magnitude, together with their vector indexes, we will obtain an approximation for $\vec{x}$ which is at least as good as $\vec{x}^{\rm opt}_k$. See Algorithm 2 and Theorem 4 in [26] for a detailed proof in the Fourier setting.
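The median-based estimation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the binary "residue" matrix built by the hypothetical helper `residue_matrix` (one row per prime and residue) is a convenient stand-in with $c_{\min} = 1$ and $K$ equal to the number of primes, not necessarily one of the matrices constructed in this paper, and `median_estimate` is likewise a name introduced here. All rows that are nonzero in column $n$ play the role of $\mathcal{M}(K,n)$.

```python
from statistics import median

def residue_matrix(primes, N):
    """Binary matrix with one row per (prime p, residue r); entry is 1 iff n % p == r."""
    return [[1 if n % p == r else 0 for n in range(N)]
            for p in primes for r in range(p)]

def median_estimate(M, Mx, n):
    """Estimate x[n] from the rescaled measurements whose rows are nonzero in column n."""
    ratios = [Mx[j] / M[j][n] for j in range(len(M)) if M[j][n] != 0]
    # Real and imaginary parts are estimated by separate medians.
    return complex(median(z.real for z in ratios), median(z.imag for z in ratios))

N = 30
primes = [5, 7, 11, 13, 17]      # every column has K = 5 nonzero entries
M = residue_matrix(primes, N)

x = [0j] * N
x[3] = 4 + 2j                    # two large entries
x[17] = -3j
x[8] = 0.01                      # one small entry acting as "noise"

Mx = [sum(M[j][n] * x[n] for n in range(N)) for j in range(len(M))]
estimates = [median_estimate(M, Mx, n) for n in range(N)]

# Report the largest estimates in magnitude together with their indexes.
top = sorted(range(N), key=lambda n: abs(estimates[n]), reverse=True)[:2]
print(top, [estimates[n] for n in top])
```

In this toy instance each large entry collides with another large entry in only one of its five rows, so the medians ignore the polluted measurements.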

It is worth noting that the randomized approximation results in [26] also generalize in this manner (i.e., Corollaries 3 and 4 in [26]). If a small set of rows is randomly selected from a $(K, c_{\min}, \alpha)$-coherent matrix, the resulting submatrix can still be used to yield an accurate instance optimal approximation for any $\vec{x} \in \mathbb{C}^N$ with high probability. The proof of this fact follows from Corollary 17 below. However, before we can state the corollary we need an additional definition: For any multiset,

$$\tilde{s} = \{ s_1, s_2, \dots, s_\beta \} \subset [1,m] \cap \mathbb{N},$$

we will let $\mathcal{M}_{\tilde{s}}$ denote the $\beta \times N$ matrix formed by the $\beta$ rows of $\mathcal{M}$ listed in $\tilde{s}$. In other words,

$$\mathcal{M}_{\tilde{s}} = \begin{pmatrix} \mathcal{M}_{s_1} \\ \mathcal{M}_{s_2} \\ \vdots \\ \mathcal{M}_{s_\beta} \end{pmatrix}. \qquad (16)$$



We have the following result. 

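Corollary 17. Suppose $\mathcal{M}$ is an $m \times N$ $(K, c_{\min}, \alpha)$-coherent matrix. Let $k \in [1, K \cdot c_{\min}^2/\alpha] \cap \mathbb{N}$, $\epsilon \in (0,1]$, $c \in [14, \infty) \cap \mathbb{N}$, $\sigma \in [2/3, 1)$, and $\vec{x} \in \mathbb{C}^N$. Select a multiset of the rows of $\mathcal{M}$, $\tilde{s} \subset [1,m] \cap \mathbb{N}$, by independently choosing

$$\beta = \left\lceil 28.56 \cdot \frac{m}{K} \ln\left( \frac{2N}{1-\sigma} \right) \right\rceil \qquad (17)$$

values from $[1,m] \cap \mathbb{N}$ uniformly at random with replacement. If $K > c \cdot \left( k\alpha / (c_{\min}^2 \epsilon) \right)$ then $\mathcal{M}_{\tilde{s}}$ will have both of the following properties with probability at least $\sigma$:

1. There will be at least $\tilde{l} = 21 \cdot \ln\left( \frac{2N}{1-\sigma} \right)$ nonzero values in every column of $\mathcal{M}_{\tilde{s}}$. Hence, $\mathcal{M}_{\tilde{s}}(\tilde{l}, n)$ will be well defined for all $n \in [0,N) \cap \mathbb{N}$ (see Equation 13).

2. For all $n \in [0,N) \cap \mathbb{N}$ more than 1/2 of the entries in $\mathcal{M}_{\tilde{s}}(\tilde{l}, n) \cdot \vec{x}$ (i.e., more than half of the values $j \in [1, \tilde{l}] \cap \mathbb{N}$, counted with multiplicity) will have

$$\left| \frac{(\mathcal{M}_{\tilde{s}}(\tilde{l},n) \cdot \vec{x})_j}{(\mathcal{M}_{\tilde{s}}(\tilde{l},n))_{j,n}} - x_n \right| \le \frac{\epsilon \, \|\vec{x} - \vec{x}^{\rm opt}_{(k/\epsilon)}\|_1}{k}.$$

8 More precisely, the error bound in Equation 21 of Theorem 6 in [26] holds without the additional third $22\sqrt{k}\,\|\cdot\|_1$ term.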

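Proof:

Fix $n \in [0,N) \cap \mathbb{N}$. We select our multiset, $\tilde{s} \subset [1,m] \cap \mathbb{N}$, of the rows of $\mathcal{M}$ by choosing $\beta$ elements of $[1,m] \cap \mathbb{N}$ uniformly at random with replacement. Denote the $j^{\rm th}$ element chosen for $\tilde{s}$ by $s_j$. Finally, let $P^n_j$ be the random variable indicating whether $\mathcal{M}_{s_j,n} > 0$, and let $Q^n_j$ be the random variable indicating whether $s_j$ satisfies

$$\left| \frac{(\mathcal{M} \cdot \vec{x})_{s_j}}{\mathcal{M}_{s_j,n}} - x_n \right| \le \frac{\epsilon \, \|\vec{x} - \vec{x}^{\rm opt}_{(k/\epsilon)}\|_1}{k} \qquad (18)$$

conditioned on $P^n_j$. Thus, $P^n_j = 1$ if $\mathcal{M}_{s_j,n} > 0$, and $0$ otherwise. Similarly,

$$Q^n_j = \begin{cases} 1 & \text{if } s_j \text{ satisfies Equation 18 and } P^n_j = 1 \\ 0 & \text{otherwise.} \end{cases}$$

Lemma 16 implies that $\mathbb{P}\left[ Q^n_j = 1 \mid P^n_j = 1 \right] > \frac{6}{7}$. Furthermore, $\mu = \mathbb{E}\left[ \sum_{j=1}^{\beta} Q^n_j \mid P^n_1, \dots, P^n_\beta \right] > \frac{6}{7} \left( \sum_{j=1}^{\beta} P^n_j \right)$.

Let $l = \sum_{j=1}^{\beta} P^n_j$. The Chernoff bound guarantees that

$$\mathbb{P}\left[ \sum_{j=1}^{\beta} Q^n_j < \frac{4 \cdot l}{7} \right] < e^{-\mu/18} < e^{-l/21}.$$

Thus, if $l \ge 21$ we can see that $\sum_{j=1}^{\beta} Q^n_j$ will be less than $\frac{l}{2}$ with probability less than $e^{-1}$. Hence, if $l \ge 21 \ln\left( \frac{2N}{1-\sigma} \right)$, then Property 2 will fail to be satisfied for $n$ with probability less than $\frac{1-\sigma}{2N}$. Focusing now on $l$, we note that $\mathbb{P}\left[ P^n_j = 1 \right] \ge \frac{K}{m}$ so that $\tilde{\mu} = \mathbb{E}[l] \ge \frac{K}{m}\beta$.

Let $\tilde{l} = 21 \ln\left( \frac{2N}{1-\sigma} \right)$. Applying the Chernoff bound one additional time reveals that $\mathbb{P}\left[ l < \tilde{l} \right] < e^{-(\tilde{\mu} - \tilde{l})^2 / (2\tilde{\mu})}$. Hence, if we wish to bound $\mathbb{P}\left[ l < \tilde{l} \right]$ from above by $\frac{1-\sigma}{2N}$ it suffices to have $\tilde{\mu}^2 - \frac{44}{21}\tilde{l}\tilde{\mu} + \tilde{l}^2 \ge 0$. Setting $\beta \ge 1.36 \cdot \frac{m}{K}\tilde{l} = 28.56 \cdot \frac{m}{K} \ln\left( \frac{2N}{1-\sigma} \right)$ achieves this goal. The end result is that $\mathcal{M}_{\tilde{s}}$ will fail to satisfy both Properties 1 and 2 for any $n \in [0,N) \cap \mathbb{N}$ with probability less than $\frac{1-\sigma}{N}$. Applying the union bound over all $n \in [0,N) \cap \mathbb{N}$ finishes the proof. □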
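The constants 1.36 and 28.56 appearing at the end of the proof can be checked numerically. The sketch below assumes the lower-tail Chernoff bound is used in the form $\mathbb{P}[l < (1-\delta)\tilde{\mu}] \le e^{-\tilde{\mu}\delta^2/2}$ and that the target failure probability equals $e^{-\tilde{l}/21}$; both are assumptions made for this illustration rather than statements taken from the text. Writing $t = \tilde{\mu}/\tilde{l}$, the requirement becomes $(t-1)^2/(2t) \ge 1/21$.

```python
from math import sqrt

# Requirement: (mu - l)^2 / (2 * mu) >= l / 21 with mu = t * l,
# i.e. t^2 - (44/21) * t + 1 >= 0. Its larger root is the smallest admissible t > 1.
half_b = 22 / 21
t_star = half_b + sqrt(half_b ** 2 - 1)

def chernoff_exponent(t):
    """Value of (mu - l)^2 / (2 * mu) measured in units of l, for mu = t * l."""
    return (t - 1) ** 2 / (2 * t)

print("smallest admissible ratio mu / l:", t_star)
print("exponent at t = 1.36:", chernoff_exponent(1.36), "target:", 1 / 21)
print("1.36 * 21 =", 1.36 * 21)
```

Under these assumptions the threshold ratio falls just below 1.36, and multiplying 1.36 by the factor 21 from $\tilde{l}$ gives the constant 28.56.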

Corollary 17 considers selecting a multiset of rows from a $(K, c_{\min}, \alpha)$-coherent matrix. Hence, some rows may be
selected more than once. If this occurs, rows should be considered to be selected multiple times for counting purposes
only. That is, all computations involving a row which is selected several times should still be carried out only once.
However, the results of these computations should be considered with greater weight during subsequent reconstruction
efforts (e.g., multiply selected rows should be considered as generating multiple duplicate entries in $\mathcal{M}_{\tilde{s}} \cdot \vec{x}$).
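The bookkeeping described above (measure each distinct row once, but count it with multiplicity) might be organized as follows. This is a minimal sketch; the function names, the 0-based row indexes, and the tiny matrix are all assumptions made for illustration.

```python
import random
from collections import Counter

def sample_row_multiset(m, beta, seed=0):
    """Draw beta row indexes uniformly at random with replacement; return their multiplicities."""
    rng = random.Random(seed)
    return Counter(rng.randrange(m) for _ in range(beta))

def measure_distinct_rows(M, x, multiplicity):
    """Compute one inner product per distinct selected row, however often it was drawn."""
    return {j: sum(a * b for a, b in zip(M[j], x)) for j in multiplicity}

def entries_with_multiplicity(measurements, multiplicity):
    """Expand the measurements so a row drawn t times contributes t duplicate entries."""
    entries = []
    for j in sorted(measurements):
        entries.extend([measurements[j]] * multiplicity[j])
    return entries

M = [[1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 1, 0, 0]]               # hypothetical 3 x 4 example matrix
x = [2.0, 0.0, -1.0, 0.5]

multiplicity = sample_row_multiset(m=len(M), beta=7)
measurements = measure_distinct_rows(M, x, multiplicity)
entries = entries_with_multiplicity(measurements, multiplicity)
print(dict(multiplicity), entries)
```

Only `len(multiplicity)` inner products are computed, while the expanded list handed to a later median step still has `beta` entries.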

In this paper we are primarily concerned with guaranteed approximation results. Hence, we will leave further
consideration of randomized approximation techniques to the reader. Instead, we will now consider fast approximation
algorithms for $(K, c_{\min}, \alpha)$-coherent matrices.

A.2 Sublinear-Time Approximation Techniques

Consider the proof of Lemma 16 with $c = 4$ for a given $m \times N$ $(K, c_{\min}, \alpha)$-coherent matrix $\mathcal{M}$, $k \in [1, K \cdot c_{\min}^2/\alpha] \cap \mathbb{N}$,
and $\epsilon \in (0,1]$. Let $\vec{x} \in \mathbb{C}^N$ and suppose that

$$|x_n| > 2\delta = 2 \left( \frac{\epsilon \, \|\vec{x} - \vec{x}^{\rm opt}_{(k/\epsilon)}\|_1}{k} \right) \qquad (19)$$

for some $n \in [0,N) \cap \mathbb{N}$. We will begin this section by quickly demonstrating a means of identifying $n$ using only the
measurements $\mathcal{M}\vec{x} \in \mathbb{C}^m$ together with some additional linear measurements based on a modification of our incoherent
matrix $\mathcal{M}$. This technique, first utilized in iTPTl, will ultimately allow us to develop the sublinear-time approximation
schemes we seek. However, we require several definitions before we can continue with our demonstration.

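Let $\mathcal{A} \in \mathbb{R}^{m \times N}$ and $\mathcal{C} \in \mathbb{R}^{\tilde{m} \times N}$ be real matrices. Then, their row tensor product, $\mathcal{A} \circledast \mathcal{C}$, is defined to be the
$(m \cdot \tilde{m}) \times N$ matrix whose entries are given by

$$(\mathcal{A} \circledast \mathcal{C})_{i,j} = \mathcal{A}_{i \bmod m,\, j} \cdot \mathcal{C}_{(i - (i \bmod m))/m,\, j}.$$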



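In this section, we will use the row tensor product of $\mathcal{M}$ with the $(1 + \lceil \log_2 N \rceil) \times N$ bit test matrix lfT3ll20l to help us identify $n$ from Equation 19. The $(1 + \lceil \log_2 N \rceil) \times N$ bit test matrix, $\mathcal{B}_N$, is defined by

$$(\mathcal{B}_N)_{i,j} = \begin{cases} 1 & \text{if } i = 0 \\ (i-1)^{\rm th} \text{ bit in the binary expansion of } j & \text{if } i \in [1, \lceil \log_2 N \rceil] \end{cases} \qquad (20)$$

for $0 \le i \le \lceil \log_2 N \rceil$ and $0 \le j < N$. For example, $\mathcal{B}_8$ has the form

$$\mathcal{B}_8 = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{pmatrix}.$$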
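The bit test matrix is straightforward to generate. The following sketch (with the hypothetical helper names `bit_test_matrix` and `decode_column`) builds it from the definition in Equation 20 and checks that every column below the all-ones row spells out its own index in binary, which is the property used for identification.

```python
def bit_test_matrix(N):
    """(1 + ceil(log2 N)) x N matrix: a row of ones, then one row per bit of the column index."""
    num_bits = (N - 1).bit_length()          # equals ceil(log2(N)) for N >= 1
    rows = [[1] * N]
    for i in range(num_bits):
        rows.append([(j >> i) & 1 for j in range(N)])
    return rows

def decode_column(column):
    """Recover a column index from its bits, ignoring the leading all-ones row."""
    return sum(bit << i for i, bit in enumerate(column[1:]))

B8 = bit_test_matrix(8)
for row in B8:
    print(*row)

columns = [[row[j] for row in B8] for j in range(8)]
decoded = [decode_column(col) for col in columns]
```

Row $i \ge 1$ holds bit $i - 1$ of each column index, with the lowest order bit first.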



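We will now demonstrate that $(\mathcal{M} \circledast \mathcal{B}_N)\vec{x}$ contains enough information for us to identify any $n \in [0,N) \cap \mathbb{N}$ satisfying Equation 19.

Notice that $\mathcal{M} \circledast \mathcal{B}_N$ contains $\mathcal{M}$ as a submatrix. This is due to the first row of all ones in $\mathcal{B}_N$. Similarly, the second
row of $\mathcal{B}_N$ ensures that $\mathcal{M} \circledast \mathcal{B}_N$ will contain another $m \times N$ submatrix which is identical to $\mathcal{M}$, except with all of its
even columns zeroed out. We will refer to this $m \times N$ submatrix of $\mathcal{M} \circledast \mathcal{B}_N$ as $\mathcal{M}_{\rm odd}$. We can see that

$$\mathcal{M}_{\rm odd} = \mathcal{M} \circledast (\mathcal{B}_N)_1 = \left( \vec{0} \;\; \mathcal{M}_{\cdot,1} \;\; \vec{0} \;\; \mathcal{M}_{\cdot,3} \;\; \vec{0} \;\; \mathcal{M}_{\cdot,5} \;\; \dots \right).$$

Furthermore, we define

$$\mathcal{M}_{\rm even} := \mathcal{M} - \mathcal{M} \circledast (\mathcal{B}_N)_1 = \mathcal{M} - \mathcal{M}_{\rm odd}.$$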
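A small sketch of the row tensor product, and of how the odd and even submatrices arise from it, is given below. The helper names and the $3 \times 8$ matrix are assumptions chosen only for illustration, and rows are indexed from 0 as in the definition of the row tensor product.

```python
def bit_test_matrix(N):
    """Row of ones followed by one row per bit of the column index (lowest bit first)."""
    rows = [[1] * N]
    for i in range((N - 1).bit_length()):
        rows.append([(j >> i) & 1 for j in range(N)])
    return rows

def row_tensor(A, C):
    """Row tensor product: row i is the entrywise product of row (i mod m) of A and row (i div m) of C."""
    m, N = len(A), len(A[0])
    return [[A[i % m][j] * C[i // m][j] for j in range(N)]
            for i in range(m * len(C))]

M = [[1, 2, 0, 0, 3, 0, 0, 1],
     [0, 0, 1, 1, 0, 2, 0, 0],
     [2, 0, 0, 1, 0, 0, 1, 0]]   # arbitrary 3 x 8 example
m = len(M)
T = row_tensor(M, bit_test_matrix(8))

M_odd = T[m:2 * m]               # block produced by the lowest-order bit row
M_even = [[a - b for a, b in zip(row, odd_row)] for row, odd_row in zip(M, M_odd)]
print(M_odd)
print(M_even)
```

The first $m$ rows of the product reproduce the matrix itself, so the even block never has to be measured separately: it is the difference of two blocks already present.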

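Clearly, if we are given $(\mathcal{M} \circledast \mathcal{B}_N)\vec{x}$, we will also have $\mathcal{M}\vec{x}$, $\mathcal{M}_{\rm odd}\vec{x}$, and $\mathcal{M}_{\rm even}\vec{x} \in \mathbb{C}^m$. We can use this information
to determine whether $n$ from Equation 19 is even or odd as follows.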

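Lemma 16 with $c = 4$ guarantees that more than $K/2$ distinct elements of $\mathcal{M}\vec{x} \in \mathbb{C}^m$ will be of the form

$$(\mathcal{M}\vec{x})_j = x_n \cdot \mathcal{M}_{j,n} + (\mathcal{M}'(m,n))_j \cdot \vec{y} \qquad (21)$$

for some $j \in [1,m]$ and $\vec{y} \in \mathbb{C}^{N-1}$ with $|(\mathcal{M}'(m,n))_j \cdot \vec{y}| \le c_{\min}\delta$ (see Equations 14 and 19 together with the proof of
Lemma 16). Suppose $n$ is odd. Then, for each $j$ satisfying Equation 21 we will have

$$|(\mathcal{M}_{\rm even}\vec{x})_j| = \left| (\mathcal{M}'_{\rm even}(m,n)\,\vec{y})_j \right| \le c_{\min}\delta < |x_n| \cdot \mathcal{M}_{j,n} - |(\mathcal{M}'_{\rm odd}(m,n))_j \cdot \vec{y}| \le |(\mathcal{M}_{\rm odd}\vec{x})_j|.$$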



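9 We could also use the number theoretic matrices defined in Section 5 of [26] here in place of the bit test matrix. More generally, any
1-disjunct matrix with an associated fast decoding algorithm could replace the bit test matrix throughout this section. Note that a fast $O(t)$-time
binary tree decoder can be built for any $t \times N$ 1-disjunct matrix anytime one has access to $\Omega(N)$ memory.

Algorithm 1 Approximate $\vec{x}$

 1: Input: An $m \times N$ $(K, c_{\min}, \alpha)$-coherent matrix, $\mathcal{M}$, and $(\mathcal{M} \circledast \mathcal{B}_N)\vec{x} \in \mathbb{C}^{m\lceil \log_2 N \rceil + m}$
 2: Output: $\vec{z}_S \approx \vec{x}_S$, an approximation to $\vec{x}^{\rm opt}_k$
 3: Initialize multiset $S \leftarrow \emptyset$, $\vec{z} \leftarrow \vec{0}_N$, $\vec{b} \leftarrow \vec{0}_{\lceil \log_2 N \rceil}$

    Identify All $n \in [0,N) \cap \mathbb{N}$ that Satisfy Equation 19
 4: for $j$ from 1 to $m$ do
 5:     for $i$ from 0 to $\lceil \log_2 N \rceil - 1$ do
 6:         if $|(\mathcal{M} \circledast (\mathcal{B}_N)_{i+1}\,\vec{x})_j| > |(\mathcal{M}\vec{x} - \mathcal{M} \circledast (\mathcal{B}_N)_{i+1}\,\vec{x})_j|$ then
 7:             $b_i \leftarrow 1$
 8:         else
 9:             $b_i \leftarrow 0$
10:         end if
11:     end for
12:     $n \leftarrow \sum_{i=0}^{\lceil \log_2 N \rceil - 1} b_i 2^i$
13:     $S \leftarrow S \cup \{n\}$
14: end for

    Estimate $\vec{x}_S \approx \vec{x}^{\rm opt}_k$ Using Lemma 16
15: for each $n$ value belonging to $S$ with multiplicity $> K/2$ do
16:     ${\rm Re}\{z_n\} \leftarrow$ median of multiset $\left\{ {\rm Re}\left\{ (\mathcal{M}(K,n) \cdot \vec{x})_h / (\mathcal{M}(K,n))_{h,n} \right\} \mid 1 \le h \le K \right\}$
17:     ${\rm Im}\{z_n\} \leftarrow$ median of multiset $\left\{ {\rm Im}\left\{ (\mathcal{M}(K,n) \cdot \vec{x})_h / (\mathcal{M}(K,n))_{h,n} \right\} \mid 1 \le h \le K \right\}$
18: end for
19: Sort nonzero $\vec{z}$ entries by magnitude so that $|z_{n_1}| \ge |z_{n_2}| \ge |z_{n_3}| \ge \dots$
20: $S \leftarrow \{n_1, n_2, \dots, n_{2k}\}$
21: Output: $\vec{z}_S$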



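Similarly, if $n$ is even, then for each such $j$ we will have

$$|(\mathcal{M}_{\rm odd}\vec{x})_j| = \left| (\mathcal{M}'_{\rm odd}(m,n)\,\vec{y})_j \right| \le c_{\min}\delta < |x_n| \cdot \mathcal{M}_{j,n} - |(\mathcal{M}'_{\rm even}(m,n))_j \cdot \vec{y}| \le |(\mathcal{M}_{\rm even}\vec{x})_j|.$$

Therefore, we can correctly determine $n \bmod 2$ by comparing $|(\mathcal{M}_{\rm odd}\vec{x})_j|$ with $|(\mathcal{M}_{\rm even}\vec{x})_j|$ whenever both Equations 19 and 21 hold. Of course, there is nothing particularly special about the lowest order bit of the binary representation of $n$. More generally, we can correctly determine the $i^{\rm th}$ bit of $n \in [0,N) \cap \mathbb{N}$ by comparing $|(\mathcal{M} \circledast (\mathcal{B}_N)_{i+1}\,\vec{x})_j|$ with $|((\mathcal{M} - \mathcal{M} \circledast (\mathcal{B}_N)_{i+1})\,\vec{x})_j|$ whenever both Equations 19 and 21 hold.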



We now know that we can correctly determine $n$ whenever both Equations 19 and 21 hold by finding its binary representation one bit at a time. Furthermore, Lemma 16 with $c \ge 4$ guarantees that more than $K/2$ of the $j \in [1,m]$
will satisfy Equation 21 for any given $n$. Hence, we can correctly reconstruct every $n$ for which Equation 19 holds
more than $K/2$ times by attempting to decode its binary representation for all $j \in [1,m]$. Utilizing these methods
together with ideas from Section A.1 we obtain Algorithm 1.
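The two phases just described (bitwise identification followed by median estimation) can be sketched end to end in Python. This is an illustrative sketch rather than a tuned implementation: the binary residue matrix with $c_{\min} = 1$ is an assumed stand-in for a coherent matrix, all helper names are hypothetical, rows are indexed from 0, and the measurements are stored as one list per bit test row instead of a single stacked vector.

```python
from collections import Counter
from statistics import median

def residue_matrix(primes, N):
    """Binary matrix with one row per (prime p, residue r); entry is 1 iff n % p == r."""
    return [[1 if n % p == r else 0 for n in range(N)]
            for p in primes for r in range(p)]

def bit_test_matrix(N):
    rows = [[1] * N]
    for i in range((N - 1).bit_length()):
        rows.append([(j >> i) & 1 for j in range(N)])
    return rows

def measure(M, B, x):
    """y[i][j] is the entry of the row tensor measurements for bit test row i and row j of M."""
    return [[sum(M[j][n] * B[i][n] * x[n] for n in range(len(x)))
             for j in range(len(M))]
            for i in range(len(B))]

def recover(M, y, N, K, k):
    m, num_bits = len(M), len(y) - 1
    votes = Counter()                        # multiset of decoded indexes
    for j in range(m):
        n = 0
        for i in range(num_bits):
            with_bit = y[i + 1][j]           # columns whose bit i equals 1
            without_bit = y[0][j] - with_bit # remaining columns
            if abs(with_bit) > abs(without_bit):
                n |= 1 << i
        if n < N:
            votes[n] += 1
    z = {}
    for n, count in votes.items():
        if count > K / 2:                    # keep indexes decoded more than K/2 times
            ratios = [y[0][j] / M[j][n] for j in range(m) if M[j][n] != 0]
            est = complex(median(r.real for r in ratios),
                          median(r.imag for r in ratios))
            if est != 0:
                z[n] = est
    keep = sorted(z, key=lambda n: abs(z[n]), reverse=True)[:2 * k]
    return {n: z[n] for n in keep}

N = 30
primes = [5, 7, 11, 13, 17]                  # K = 5 nonzero entries per column
M = residue_matrix(primes, N)
x = [0j] * N
x[3], x[17] = 4 + 2j, -3j
y = measure(M, bit_test_matrix(N), x)
recovered = recover(M, y, N, K=len(primes), k=2)
print(recovered)
```

Rows whose measurement is zero decode to index 0, which is then discarded because its median estimate vanishes; the decoding loop touches each measurement once, in line with the discussion above.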

In light of the preceding discussion, we can see that Algorithm 1 will be guaranteed to identify all $n \in [0,N) \cap \mathbb{N}$ that satisfy Equation 19 at least $K/2$ times each. Lemma 16 can then be used to estimate $x_n$ for each of these $n$ values as previously discussed in Section A.1. The end result is that all relatively large entries in $\vec{x}$ will be identified and accurately estimated. By formalizing the discussion above, we obtain the following result, the proof of which is
analogous to the proof of Theorem 7 in Section 5 of [26].



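Theorem 18. Suppose $\mathcal{M}$ is an $m \times N$ $(K, c_{\min}, \alpha)$-coherent matrix. Furthermore, let $\epsilon \in (0,1]$, $k \in \left[ 1, K \cdot \frac{\epsilon c_{\min}^2}{4\alpha} \right] \cap \mathbb{N}$, and $\vec{x} \in \mathbb{C}^N$. Then, Algorithm 1 will output a $\vec{z}_S \in \mathbb{C}^N$ satisfying

$$\|\vec{x} - \vec{z}_S\|_2 \le \|\vec{x} - \vec{x}^{\rm opt}_k\|_2 + \frac{22\epsilon \, \|\vec{x} - \vec{x}^{\rm opt}_{(k/\epsilon)}\|_1}{\sqrt{k}}.$$

Algorithm 1 can be implemented to run in $O(m \log N)$ time.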

The runtime of Algorithm 1 can be accounted for as follows: Lines 4 through 14 can be implemented to run in
$O(m \log N)$ time since their execution time will be dominated by the time required to read each entry of $(\mathcal{M} \circledast \mathcal{B}_N)\vec{x} \in \mathbb{C}^{m\lceil \log_2 N \rceil + m}$. Counting the multiplicity of the $O(m)$ entries in $S$ in line 15 can be done by sorting $S$ in $O(m \log m)$
time, followed by one $O(m)$-time scan of the sorted data. Lines 16 and 17 will each be executed a total of $O(m/K)$
times apiece. Furthermore, lines 16 and 17 can each be executed in $O(K \log K)$ time assuming that each $\mathcal{M}(K,n)$
submatrix is known in advance.^10 Thus, the total runtime of lines 4 through 18 will also be $O(m \log N)$. Finally, line
19 requires that at most $O(m/K)$ items be sorted, which can likewise be accomplished in $O(m \log N)$ time. Therefore,
the total runtime of Algorithm 1 will be $O(m \log N)$.



10 If each $\mathcal{M}(K,n)$ submatrix is not computed once in advance this runtime will be $O(m \log m)$ instead of $O(K \log K)$.